
IRC log for #gluster, 2015-02-24


All times shown according to UTC.

Time Nick Message
00:01 kminooie kripper: it is never normal to have split-brain
00:01 kripper Shouldn't the file versions of this (writing) host be considered as the valid ones?
00:02 kminooie if i understand correctly the vm in which you are running gluster has come back up. right? at that point the volume should heal itself.
00:03 kripper well, I'm trying to bring it up, but it is not possible due to the split-brain
00:03 kminooie and there is no writing host. the fact that both gluster client ( or any client ) and gluster server happen to be on the same metal does not mean anything
00:04 kripper the VM was only running on 1 host
00:04 JoeJulian I can't tell if the VM is the server or a client, if the hypervisor is the server, client, or both. If you're writing through a client mount and one server goes down, the server that didn't go down should have attributes that allow GlusterFS to determine which one is valid.
00:05 kripper no HA enabled here
00:05 kripper ok, that answers my question
00:06 JoeJulian If you're writing to the brick, you're doing it wrong.
00:07 kripper No, I'm using oVirt with a gluster storage domain
00:08 T3 joined #gluster
00:09 brad[] kripper: just for the sake of my education, how'd that answer your question?
00:09 kripper VM is running on Host 1 and writing to a gluster volume with replica-2 located on Host 1 and Host 2
00:10 JoeJulian kripper: The details that allow that to work are stored in ,,(extended attributes) on the file in question.
00:10 glusterbot kripper: (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
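A minimal sketch of what that looks like in practice, with an illustrative brick path and volume name (not taken from this conversation):
    # run on each server, against the file's path on the brick (not the client mount)
    getfattr -m . -d -e hex /bricks/brick1/images/vm.img
    # trusted.afr.<volume>-client-0 and -client-1 hold pending-operation counters;
    # a copy whose counter for the *other* brick is non-zero claims it has writes
    # the other side never saw, and both copies claiming that is split-brain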
00:12 kminooie glusterbot: :) the link doesn't work
00:13 JoeJulian Yeah, Jeff retired that blog. I'm trying to find where I put the bits I need to know to change that link...
00:13 kminooie anything on the heal failing?
00:14 badone_ joined #gluster
00:15 JoeJulian @change "extended attributed" 2 "s@http.*@http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/@"
00:15 glusterbot JoeJulian: Error: The command "change" is available in the Factoids, Herald, and Topic plugins.  Please specify the plugin whose command you wish to call by using its name as a command before "change".
00:15 JoeJulian @factoids change "extended attributed" 2 "s@http.*@http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/@"
00:15 glusterbot JoeJulian: Error: I couldn't find any key "extended attributed"
00:15 JoeJulian @factoids change "extended attributes" 2 "s@http.*@http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/@"
00:15 glusterbot JoeJulian: The operation succeeded.
00:16 kripper joined #gluster
00:17 JoeJulian "Volume heal failed" doesn't actually mean the heal failed, but rather the info inquiry failed to get a response. You would have to check all your glusterd logs to try to find the reason why.
00:18 JoeJulian Could be related. "pkill -f glustershd" on all your servers, then restart glusterd and see if that 1st error goes away.
00:19 JoeJulian Could also be the nfs service that's throwing that error though. Trying the same thing with that would require losing your nfs mounts.
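A sketch of that sequence, to be run on each server in turn (the volume name is kripper's, the service command depends on the distro):
    pkill -f glustershd                        # stop the self-heal daemon
    service glusterd restart                   # or: systemctl restart glusterd
    gluster volume heal gluster-storage info   # see whether the inquiry now answers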
00:23 kripper JoeJulian: Well, I'm getting a "Gathering list of split brain entries on volume gluster-storage has been successful" and a list of files
00:24 kripper Should I research why this happened (since it is not normal)?
00:25 kripper this is my first split-brain. My explanation is that this occurred after shutting down the VM while paused
00:26 kripper (I guess it was paused because of the quorum)
00:26 Pupeno joined #gluster
00:27 kripper Before doing this, I disabled quorum for client and server
00:27 kripper I'm researching a solution for avoiding replica-3 (as suggested by the oVirt team)
00:28 kripper IMO, there should be a fencing solution avoiding more than one host to write at the same time
00:29 kripper but as I'm realizing now, things are so easy as I thought.
00:30 Pupeno_ joined #gluster
00:30 kripper *are NOT so easy as I thought
00:35 kminooie JoeJulian: so when Robinson says (on the mailing list) : " This is executed after the upgrade on just one machine. 3.6.2 entry  locks are not compatible with versions <= 3.5.3 and 3.6.1 that is the  reason. From 3.5.4 and releases >=3.6.2 it should work fine." this is only for cases that one is running different versions simultaneously, not after a full upgrade of all the servers?
00:39 gildub joined #gluster
00:41 kripper JoeJulian: "The most fearsome case, it should be apparent, is when the xattrs for the same file/directory on two bricks have non-zero counters for each other. This is the infamous “split brain” case, which can be difficult or even impossible for the system to resolve automatically."...
00:43 kripper JoeJulian: Ok, I don't know why this happened, but is it ok to assume that Host 1 (which was running the VM writing to the storage) has the valid files because it is local (no network latency)?
00:43 JoeJulian kminooie: kripper welcome to clustered storage, the world of race conditions and "not as easy as it looked". :)
00:44 JoeJulian kripper: not necessarily. I generally look at the mtimes, the size, the extended attributes before I make a determination. Luckily, with VMs it's usually safe to just pick one. When in doubt, copy the other someplace for safe keeping.
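A sketch of how that comparison is usually done, directly on each brick (the paths here are illustrative):
    stat /bricks/brick1/images/vm.img                     # compare size and mtime on both servers
    getfattr -m . -d -e hex /bricks/brick1/images/vm.img  # compare the afr counters
    # keep the losing copy somewhere safe before removing it:
    cp --sparse=always /bricks/brick1/images/vm.img /root/vm.img.bak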
00:44 kminooie :) yup
00:45 kripper JoeJulian: can I disable the replicated bricks and test the VM accessing only the local replica to skip the backup?
00:47 kripper JoeJulian: Is it possible to setup gluster so that it replicates in just one direction (from host-1 to host-2) and to always trust the files on host-1?
00:47 JoeJulian My initial reaction was to say no, but if that's the only activity that happens on the volume while you do that, I guess it would be ok.
00:48 JoeJulian You can't make it *more* split-brain.
00:48 kripper JoeJulian: basically, I'm trying to use gluster as a continuous backup solution from host-1 to host-2, where all changes will be done by host-1
00:49 kminooie kripper:  :) you can easily do that by just running rsync manually
00:49 JoeJulian kripper: sort-of, sure. But then what happens when the disk controller fails on host-1 and you run on host-2 for a while, then replace the controller and wipe out all your new work on host-2?
00:49 JoeJulian That just leads to madness.
00:49 kripper kminooie: but it's not continuous
00:50 JoeJulian If you just want a backup tool, look at geo-replication
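For reference, a rough sketch of geo-replication setup (volume and host names are illustrative; it assumes the slave volume already exists and passwordless root SSH to the slave host is in place):
    gluster system:: execute gsec_create
    gluster volume geo-replication myvol host-2::myvol-slave create push-pem
    gluster volume geo-replication myvol host-2::myvol-slave start
    gluster volume geo-replication myvol host-2::myvol-slave status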
00:53 T3 joined #gluster
00:54 kripper JoeJulian: In my dream, when host-2 starts replacing host-1, it should run accessing the replicated copy on host-2...Once host-1 recovers, the replication direction should now go from host-2 to host-1
00:54 kripper and fencing would be the key
00:57 DV joined #gluster
00:59 MugginsM joined #gluster
01:01 T3 joined #gluster
01:01 wkf joined #gluster
01:10 kripper I removed the replicated bricks and the VM started fine
01:11 kripper JoeJulian: Is my dream feasible?
01:11 JoeJulian kminooie: to finally answer your question, that email should not apply if all servers and clients are upgraded.
01:12 JoeJulian *but*, if the services were not restarted after the upgrade, you would still be running the old versions of them and stuff wouldn't work correctly.
01:13 JoeJulian kripper: feasible, yes, optimal, no.
01:13 kripper JoeJulian: please explain
01:14 kripper JoeJulian: because of the timeouts involved in detecting and switching primary/secondary servers?
01:15 JoeJulian kripper: The idea of a clustered system for redundancy is that you have the cluster managing that for you. If you want quorum, use server quorum with a 3rd server that isn't necessarily participating in providing storage. Then you have an always on storage system that you can not only fail-over your VM, but you can live-migrate it.
01:15 JoeJulian If you fail-over to a recent backup, you've lost data.
01:16 kminooie ok thanks I'll try to restart everything ( and remove that unused sock file before bringing everything back up ) later today and I guess we'll see what happens :)  and I have definitely restarted everything a couple of times already ( after the upgrade ) but I guess one more time would not hurt
01:17 JoeJulian btw, kripper, if using ovirt with libvirt/kvm, your VMs are communicating directly with *both* servers at all times. Every write happens synchronously.
01:17 JoeJulian It's the main purpose of having clustered storage.
01:18 JustinClift Aha, finally got the release-3.6 branch code running in the bulk regression tests
01:18 JustinClift Should have some results in a few hours :)
01:18 JoeJulian JustinClift: what was the hangup?
01:19 JustinClift http://review.gluster.org/#/c/9728/
01:19 JustinClift And for some unknown reason plymouthd was eating cpu as well
01:19 JoeJulian Ah, no relation to semiosis compile issue then.
01:19 JustinClift Nope
01:19 JoeJulian darn
01:20 JoeJulian I'm sure it's that library Kaleb mentioned. Probably missing from the ubuntu dependencies.
01:21 * JustinClift has no idea.  Haven't been looking into semiosis' issue ;)
01:22 JoeJulian Yeah, when you said "got" and "running" my brain leapt in the wrong direction. :D
01:31 nishanth joined #gluster
01:39 elitecoder what's the best way to turn a gfid into the file path
01:39 elitecoder shit like this does not tell me where to look a file up: Conflicting entries for <gfid:a66629b1-a3af-4b58-adf5-fa1457c0db2e>/151
01:39 JoeJulian bernux: Initial tests with 1M files on my home network with a cheap switch that can't do jumbo frames and slower commodity drives raid-6 still performed twice as fast as the results you're reporting. Not sure where to point you. Look at the performance from one end to the other. cpu, ram, nic, cabling, switch, etc.
01:40 JoeJulian @lookup
01:40 bala joined #gluster
01:40 JoeJulian @resolver
01:40 glusterbot JoeJulian: I do not know about 'resolver', but I do know about these similar topics: 'gfid resolver'
01:40 JoeJulian @gfid resolver
01:40 glusterbot JoeJulian: https://gist.github.com/4392640
01:41 JoeJulian elitecoder: ^
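That script boils down to the fact that a brick stores each regular file's gfid as a hard link under .glusterfs/<aa>/<bb>/<gfid>, so a sketch of the lookup (brick path illustrative) is:
    BRICK=/bricks/brick1
    GFID=a66629b1-a3af-4b58-adf5-fa1457c0db2e
    find "$BRICK" -samefile "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID" -not -path '*/.glusterfs/*'
    # directories are symlinks rather than hard links, so for those just readlink the gfid entry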
01:41 elitecoder looking
01:43 sprachgenerator joined #gluster
01:43 elitecoder strange that gluster doesn't have this built in
01:44 JoeJulian file a bug report
01:44 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
01:44 JoeJulian :D
01:44 elitecoder iiii'm pretty sure they know that this is dumb
01:44 elitecoder less sure there's no built-in way to look it up
01:44 elitecoder tomorrow i get to go through all the gfid complaints it's giving
01:44 elitecoder conflicting, blah blah
01:45 elitecoder so this script is going to be run ... many times
01:45 elitecoder lol
01:45 elitecoder thanks again joe
01:45 elitecoder bugyalater
01:53 chirino joined #gluster
01:54 T3 joined #gluster
01:58 kripper JoeJulian: Thanks, I'm glusterized now :-) oVirt people are suggesting/requiring replica-3. My hope was to do something with just replica-2, which is fine for a fail-over setup, but I guess that replica-3 is the cost of the benefits of a clustered storage system (I couldn't find an explanation of why replica-3 would be required).
01:58 kripper Why did you suggest to use server quorum with a 3rd server and with no-storage? Will it prevent split-brains with just two copies (replica-2)?
02:08 harish_ joined #gluster
02:10 JoeJulian yes
02:11 JoeJulian and replica 3 is overkill unless you have a 6 9's availability requirement.
02:11 rjoseph joined #gluster
02:26 kripper JoeJulian: nice to hear that. Can you please suggest some reading for setting up a 3rd no-storage server? I don't even know what to call this (replica-2.5?)
02:27 kripper JoeJulian: What about an oVirt 3.5 hosted-engine running on this kind of gluster volume mounted as NFS?
02:28 sprachgenerator joined #gluster
02:29 kripper JoeJulian: I would be glad to test and document this kind of setup
02:31 JoeJulian Why would you use ovirt, libvirt, kvm with NFS? That's like buying a Ferrari and letting your cat drive it.
02:31 JoeJulian Use libgfapi.
02:31 JoeJulian :D
02:32 T3 joined #gluster
02:32 JoeJulian @lucky glusterfs server quorum
02:32 glusterbot JoeJulian: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Managing_Volumes-Quorum.html
02:35 kripper JoeJulian: haha...I agree...just because hosted-engine only supports NFS at the moment
02:36 kripper JoeJulian: 3.6 will support hosted-engine on glusterfs, but release date will be August IIRC
02:39 kripper JoeJulian: I know how to setup server-side quorum, but I'm not sure how to add the 3rd gluster no-storage server
02:40 kripper JoeJulian: must it be peered and added to the volume someway?
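A sketch of what that setup usually looks like (using kripper's volume name; the third host is illustrative): the third box just runs glusterd and is peered without contributing any bricks, and server-side quorum is then enabled:
    gluster peer probe host-3
    gluster volume set gluster-storage cluster.server-quorum-type server
    # optionally adjust the pool-wide ratio (the default requires >50% of peers):
    gluster volume set all cluster.server-quorum-ratio 51%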
02:48 ilbot3 joined #gluster
02:48 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
02:55 sprachgenerator joined #gluster
03:09 bala joined #gluster
03:09 kripper joined #gluster
03:16 T3 joined #gluster
03:26 sprachgenerator_ joined #gluster
03:28 anrao joined #gluster
03:40 aravindavk joined #gluster
03:40 kripper1 joined #gluster
03:44 kanagaraj joined #gluster
03:45 atinmu joined #gluster
03:53 itisravi joined #gluster
03:57 chirino joined #gluster
03:58 bharata-rao joined #gluster
04:00 soumya joined #gluster
04:05 nishanth joined #gluster
04:05 gem joined #gluster
04:11 shubhendu joined #gluster
04:21 kdhananjay joined #gluster
04:22 itpings hi guys
04:22 itpings sup everyone
04:22 itpings was wondering if my howtos have been approved!
04:28 T3 joined #gluster
04:34 JustinClift itpings: Remind me which one? :)
04:35 itpings gluster with urbackup
04:36 JustinClift Ahhh.  I don't know if anyone has specifically read over it or not.
04:36 itpings hmm
04:36 JustinClift Gimme a sec, and I'll take a look quickly now.
04:36 itpings ok
04:36 anoopcs joined #gluster
04:38 JustinClift itpings: k, looking over it, it seems pretty much exactly what we like to see.
04:39 kripper joined #gluster
04:39 JustinClift itpings: Would you be ok to make it into a wiki page on our wiki, then link to it from the main/front index page of the wiki?
04:40 JustinClift We can probably create a new category like "Using Gluster with Backup software", and list this and others (eg BareOS, etc)
04:41 itpings i just want someone to proof read
04:42 jiffin joined #gluster
04:43 nbalacha joined #gluster
04:44 schandra joined #gluster
04:44 ppai joined #gluster
04:48 RameshN joined #gluster
04:53 kripper joined #gluster
04:53 JustinClift itpings: When you say "proof read", do you also want someone to run through the instructions to validate them?
04:53 JustinClift itpings: Or are you meaning "proof read" as in typops/
04:53 schandra joined #gluster
04:54 JustinClift typos/sentence structure, that kind of thing?
04:57 rafi joined #gluster
04:59 JustinClift itpings: Btw, sorry if my wording is lousy.  Am really in need of sleep atm :)
05:00 ndarshan joined #gluster
05:04 jobewan joined #gluster
05:05 jobewan joined #gluster
05:08 Manikandan joined #gluster
05:12 prasanth_ joined #gluster
05:17 kripper joined #gluster
05:17 meghanam joined #gluster
05:18 kripper joined #gluster
05:25 Manikandan joined #gluster
05:33 plarsen joined #gluster
05:33 plarsen joined #gluster
05:33 ndarshan joined #gluster
05:35 kumar joined #gluster
05:40 spandit joined #gluster
05:43 smohan joined #gluster
05:44 ramteid joined #gluster
05:46 vimal joined #gluster
05:48 coredump joined #gluster
05:48 deepakcs joined #gluster
05:49 dusmant joined #gluster
05:50 nbalacha joined #gluster
05:50 hagarth joined #gluster
05:52 plarsen joined #gluster
05:57 itpings sorry about the late reply Justin
05:57 itpings i just want to make sure that people can read it easily
05:57 itpings testing has been done many times so the thing is working fine
05:58 overclk joined #gluster
06:02 maveric_amitc_ joined #gluster
06:05 lalatenduM joined #gluster
06:11 smohan joined #gluster
06:12 atalur joined #gluster
06:13 rp__ joined #gluster
06:15 raghu joined #gluster
06:21 glusterbot News from newglusterbugs: [Bug 1165938] Fix regression test spurious failures <https://bugzilla.redhat.com/show_bug.cgi?id=1165938>
06:23 dusmant joined #gluster
06:24 sputnik13 joined #gluster
06:25 kshlm joined #gluster
06:28 Manikandan joined #gluster
06:34 sputnik13 joined #gluster
06:46 bala joined #gluster
06:54 schandra joined #gluster
06:55 aravindavk joined #gluster
07:05 sputnik13 joined #gluster
07:13 sputnik13 joined #gluster
07:16 nbalacha joined #gluster
07:19 jtux joined #gluster
07:27 maveric_amitc_ joined #gluster
07:41 huleboer joined #gluster
07:45 [Enrico] joined #gluster
07:50 mbukatov joined #gluster
07:50 nshaikh joined #gluster
07:52 schandra joined #gluster
08:00 sputnik13 joined #gluster
08:02 [Enrico] joined #gluster
08:16 itisravi joined #gluster
08:24 the-me joined #gluster
08:31 ricky-ti1 joined #gluster
08:31 lalatenduM joined #gluster
08:35 [Enrico] joined #gluster
08:38 shubhendu joined #gluster
08:43 kovshenin joined #gluster
08:46 T3 joined #gluster
08:55 nishanth joined #gluster
08:55 dusmant joined #gluster
09:01 Norky joined #gluster
09:06 liquidat joined #gluster
09:07 T0aD joined #gluster
09:08 nangthang joined #gluster
09:12 prasanth_ joined #gluster
09:13 soumya_ joined #gluster
09:14 Debloper joined #gluster
09:15 harish_ joined #gluster
09:19 Slashman joined #gluster
09:21 awerner joined #gluster
09:22 glusterbot News from newglusterbugs: [Bug 1193636] [DHT:REBALANCE]: xattrs set on the file during rebalance migration will be lost after migration is over <https://bugzilla.redhat.com/show_bug.cgi?id=1193636>
09:22 sputnik13 joined #gluster
09:32 nishanth joined #gluster
09:38 kumar joined #gluster
09:40 ctria joined #gluster
09:44 deniszh joined #gluster
09:52 glusterbot News from newglusterbugs: [Bug 1195646] Add a new API to get first Changelog entry in latest HTIME file <https://bugzilla.redhat.com/show_bug.cgi?id=1195646>
09:55 ricky-ticky2 joined #gluster
10:01 ricky-ti1 joined #gluster
10:03 LinuxChef joined #gluster
10:04 LinuxChef Where do I report link-spam in the VCA viewer as done by this user: https://forge.gluster.org/~senlldy ?
10:06 ndevos LinuxChef: I think JustinClift can handle that, he would become online later, or send an email to gluster-infra@gluster.org
10:06 LinuxChef Thank you, I'll send him a mail.
10:07 ndevos LinuxChef: thanks for reporting!
10:07 LinuxChef you're welcome!
10:08 LinuxChef my mail is held back until the moderator reviews it
10:20 ndarshan joined #gluster
10:20 LebedevRI joined #gluster
10:21 ricky-ticky2 joined #gluster
10:22 glusterbot News from newglusterbugs: [Bug 1113460] after enabling quota, peer probing fails on glusterfs-3.5.1 <https://bugzilla.redhat.com/show_bug.cgi?id=1113460>
10:22 glusterbot News from newglusterbugs: [Bug 1192075] libgfapi clients hang if glfs_fini is called before glfs_init <https://bugzilla.redhat.com/show_bug.cgi?id=1192075>
10:22 glusterbot News from newglusterbugs: [Bug 1192378] Disperse volume: client crashed while running renames with epoll enabled <https://bugzilla.redhat.com/show_bug.cgi?id=1192378>
10:22 glusterbot News from newglusterbugs: [Bug 1192435] server crashed during rebalance in dht_selfheal_layout_new_directory <https://bugzilla.redhat.com/show_bug.cgi?id=1192435>
10:28 LebedevRI joined #gluster
10:41 vimal joined #gluster
10:48 ira joined #gluster
10:48 georgeh-LT2 joined #gluster
10:51 ira joined #gluster
10:52 glusterbot News from newglusterbugs: [Bug 1195668] Perf:  DHT errors filling logs when perf tests are run. <https://bugzilla.redhat.com/show_bug.cgi?id=1195668>
10:54 prasanth_ joined #gluster
11:02 ndarshan joined #gluster
11:02 shubhendu joined #gluster
11:23 ppai joined #gluster
11:27 firemanxbr joined #gluster
11:28 firemanxbr joined #gluster
11:29 soumya_ joined #gluster
11:32 gildub joined #gluster
11:33 firemanxbr joined #gluster
11:35 firemanxbr joined #gluster
11:35 firemanxbr joined #gluster
11:36 soumya joined #gluster
11:37 lalatenduM joined #gluster
11:39 diegows joined #gluster
11:42 sputnik13 joined #gluster
11:45 misc mhh so would someone have an idea of the bandwidth used by the current CI ?
11:45 ndarshan joined #gluster
11:52 crashmag joined #gluster
11:53 meghanam joined #gluster
11:56 ndevos REMINDER: Gluster Community Bug Triage meeting starts in a few minutes in #gluster-meeting
11:57 ndevos misc: the test cases only use 'localhost', so bandwidth is quite low, I guess
11:57 ndevos misc: well, there are mock build tests, those would download the RPMs from Fedora/EPEL
11:59 misc ndevos: yeah, so we can estimate installing 1 chroot for each run ?
11:59 misc or there is a local cache
11:59 misc it might be easier maybe to set munin, but I am not root on the jenkins infra :)
11:59 misc JustinClift: ^ :)
12:02 rwheeler joined #gluster
12:03 ndevos misc: I like Zabbix?
12:03 misc ndevos: it seems great, but I think it is a bit harder to automate
12:03 ndevos misc: and the site-defaults.cfg for mock should be able to specify package caching, not sure if that is done
12:03 misc while munin is dead easy
12:04 ndevos misc: I dont know munin :) firemanxbr knows Zabbix :)
12:04 misc but we could go on zabbix for monitoring and stuff, that's the next step after ldap, which is the next step after salt :)
12:05 misc ndevos: I am not against if he set it up
12:05 ndevos misc: we can probably get assistance from puiterwijk for ldap and central auth
12:06 misc ndevos: that part is covered by freeipa
12:06 misc that's really easy
12:06 ndevos misc: I dont know, but I guess any assistance would be welcome, you seem to be pretty bust?
12:06 ndevos uh, busy :)
12:06 misc as I said yesterday, I managed to set freeipa faster than jack sparrow managed to get back the black pearl :)
12:07 misc ndevos: as we all are
12:07 misc there is also some stuff that requires planning
12:25 flossie joined #gluster
12:26 flossie Hi, just a very small point but thought it's worth mentioning: when you run through the quick start guide, the links at the bottom of the pages take you to the old website, so it gets a bit confusing
12:27 flossie sorry if this is wrong place to share this info
12:31 fl0w0lf joined #gluster
12:31 prasanth_ joined #gluster
12:31 fl0w0lf Hello!
12:31 glusterbot fl0w0lf: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
12:32 nbalacha joined #gluster
12:33 fl0w0lf Does Gluster encrypt network traffic between nodes? Is there a way to do so?
12:39 prasanth_ joined #gluster
12:40 DV joined #gluster
12:40 ndarshan joined #gluster
12:43 ppai joined #gluster
12:43 ndevos fl0w0lf: you can setup SSL encryption, but that is not very easy to do (and I've never tried it)
12:45 fl0w0lf ndevos: I'm considering running Gluster on several nodes in a datacenter where I don't trust the network. There is no way I can run it without encryption. Do you have any HOWTOs you can point me at? VPN could be an alternative, but might be unnecessary overhead..
12:45 pkoro joined #gluster
12:47 ndevos fl0w0lf: https://github.com/gluster/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_ssl.md
12:47 fl0w0lf ndevos: thx!
12:47 ndevos yw!
12:48 ndevos it actually does not look *that* difficult to setup
12:49 fl0w0lf ndevos: yep, looks manageable
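Condensed from that document, the setup is roughly this (volume name and allowed CNs are illustrative): place a key, a certificate and a CA bundle on every server and client, then enable the SSL options:
    #   /etc/ssl/glusterfs.key  /etc/ssl/glusterfs.pem  /etc/ssl/glusterfs.ca
    gluster volume set myvol client.ssl on
    gluster volume set myvol server.ssl on
    gluster volume set myvol auth.ssl-allow 'server1-cn,server2-cn,client1-cn'
    # stop and start the volume for the options to take effect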
12:51 andreasch joined #gluster
12:52 vimal joined #gluster
12:52 glusterbot News from newglusterbugs: [Bug 1193474] Package libgfapi-python for its consumers <https://bugzilla.redhat.com/show_bug.cgi?id=1193474>
12:52 glusterbot News from newglusterbugs: [Bug 1193767] [Quota] : gluster quota list does not show proper output if executed within few seconds of glusterd restart <https://bugzilla.redhat.com/show_bug.cgi?id=1193767>
12:59 bernux joined #gluster
13:03 elico1 joined #gluster
13:08 nishanth joined #gluster
13:14 bernux joined #gluster
13:19 DV joined #gluster
13:21 atalur joined #gluster
13:23 ndevos REMINDER: in a few minutes, the BitRot G+ Hangout will start - http://goo.gl/dap9rF
13:26 elico joined #gluster
13:29 elico1 joined #gluster
13:29 fl0w0lf left #gluster
13:30 ppai joined #gluster
13:34 aravindavk joined #gluster
13:40 kshlm joined #gluster
13:52 firemanxbr yep Zabbix is very simple for me :D
13:52 firemanxbr I can help us :)
13:55 ndevos firemanxbr: what do you think of Zabbix vs munin?
13:56 firemanxbr ndevos, for me Zabbix is much better, This system have trigger, alerts, maps, and custom screens
13:56 firemanxbr ndevos, moust robust
13:57 ndevos firemanxbr: I use Zabbix at home, and am happy with it :)
13:57 ndevos misc: do you have anything against Zabbix?
13:58 hagarth joined #gluster
14:00 misc ndevos: besides not knowing it that much, no
14:00 misc this and the fact that i do not know how much is it cfgmgmt friendly
14:01 ndevos misc: well, I think firemanxbr knows about salt too, so he can judge about it?
14:01 misc ndevos: sure, but I tried zabbix with ansible and it was not that straightforward :)
14:02 misc maybe I do not know zabbix, as centos people keep telling me how great it is
14:03 ndevos ah, my Zabbix config isnt completely in ansible, only some bit, mainly the Zabbix-agent parts
14:03 misc I tend to put everything I can in cfgmgmt, so I am a bit absolutist on this point
14:04 misc ( like I managed to put the phpbb config in puppet, the forum admins hated me for that as it was resetting their changes )
14:04 elico joined #gluster
14:04 rotbeard joined #gluster
14:05 misc but let's finish first to send the email :)
14:05 ndevos oh, I absolutely agree that a real-world deployment should be in some cfgmgmt, I just dont care too much about my home setup
14:05 msmith joined #gluster
14:05 ndevos well, 'yet' that is, I've only been trying ansible recently
14:08 DV joined #gluster
14:10 bene2 joined #gluster
14:16 virusuy joined #gluster
14:17 virusuy joined #gluster
14:17 navid__ joined #gluster
14:18 meghanam joined #gluster
14:20 dgandhi joined #gluster
14:21 dusmant joined #gluster
14:22 meghanam joined #gluster
14:22 Slashman joined #gluster
14:23 T3 joined #gluster
14:27 wkf joined #gluster
14:29 firemanxbr I like the idea, FreeIPA + Zabbix, robust and integrated.
14:29 nshaikh left #gluster
14:32 snewpy Hi, I'm trying to get gluster volumes to work with qemu using libgfapi using RDMA as non-root.  Everything works fine as root, but when done as a normal user, it hangs in libgfapi getting an -EACCES error trying to write to /dev/infiniband/rdma_cm because it's trying to bind privileged ports
14:33 snewpy in spite of having server.allow-insecure on enabled on the volume and option rpc-auth-allow-insecure on turned on in glusterd.vol (and both the volume and glusterd restarted on all nodes)
14:34 ndevos snewpy: restarting the glusterfsd (brick) process is not sufficient, you really need to stop/start the volume to have the server.allow-insecure option take effect
14:34 snewpy ndevos: i did that too
14:34 ndevos snewpy: oh, then I dont know... rafi left 25 minutes ago, he should be able to help you :-/
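For anyone following along, the insecure-port combination being discussed is roughly this (volume name illustrative):
    gluster volume set myvol server.allow-insecure on
    # plus, in /etc/glusterfs/glusterd.vol on every server:
    #     option rpc-auth-allow-insecure on
    # then restart glusterd everywhere and stop/start the volume:
    gluster volume stop myvol && gluster volume start myvol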
14:35 side_control joined #gluster
14:36 Leildin joined #gluster
14:36 Leildin Hello fellow glusturians
14:37 ndevos snewpy: I guess you should send an email to the gluster-users list, rafi is our rdma expert and we can poke him to respond
14:37 georgeh-LT2 joined #gluster
14:37 snewpy ndevos: ok, I will keep an eye out... using tcp it works perfectly, so I'm pretty confident it's not my configuration... also granting cap_net_bind_service allows it to work
14:38 Leildin I hope you guys can help me, I've found nothing online about automatic samba sharing in gluster 3.6, I would like to remove the automatic sharing but can't for the life of me find where
14:39 ndevos Leildin: there are some hook scripts that set it up, /var/lib/glusterd/hooks/... somewhere
14:40 lalatenduM @sambavfs
14:40 glusterbot lalatenduM: http://lalatendumohanty.wordpress.com/2014/02/11/using-glusterfs-with-samba-and-samba-vfs-plugin-for-glusterfs-on-fedora-20/
14:40 lalatenduM Leildin, are you looking for the above link?
14:41 elico1 joined #gluster
14:41 Leildin I'm using centos but any leads will do. I just upgraded from 3.5 and suddenly I have an automatic share done by samba where I already had my own with specific settings.
14:44 social joined #gluster
14:44 ppai left #gluster
15:11 soumya joined #gluster
15:12 wushudoin joined #gluster
15:16 DV joined #gluster
15:25 deepakcs joined #gluster
15:27 bennyturns joined #gluster
15:34 nbalacha joined #gluster
15:38 kshlm joined #gluster
15:38 raz º╲˚\╭ᴖ_ᴖ╮/˚╱º   Y A Y !
15:38 raz Ò_Ó
15:38 raz <o/
15:38 raz (◕︵◕)
15:38 raz ᕕ( ᐛ )ᕗ
15:39 bitpushr joined #gluster
15:40 nbalacha joined #gluster
15:46 ildefonso joined #gluster
15:48 dbruhn joined #gluster
15:54 msmith joined #gluster
15:58 rwheeler joined #gluster
16:10 kshlm joined #gluster
16:19 geerlingguy joined #gluster
16:26 JustinClift misc: Do you need root on the Jenkins infra?
16:29 misc JustinClift: well, to put them in salt, and then in ldap/ipa, it would be better
16:29 misc I will make sure to not disrupt anything :)
16:39 bennyturns joined #gluster
16:41 gem joined #gluster
16:48 Slashman hello, is there a way to mount a glusterfs volume with posix ACL support? my try didn't work...
17:03 kkeithley1 joined #gluster
17:03 JoeJulian snewpy: Are you sure its because of trying to bind to privileged ports, or is it permissions reaching /dev/infiniband/rdma_cm?
17:04 JoeJulian Leildin: There's no automatic samba sharing.
17:05 JoeJulian Wait I just read that you say you see an automatic share...
17:07 T3 joined #gluster
17:07 snewpy JoeJulian: definitely trying to bind privileged ports, setcap'ing cap_net_bind_service on the qemu binary makes it work
17:13 jackdpeterson joined #gluster
17:13 kripper joined #gluster
17:28 JoeJulian Leildin: Apparently automatic samba sharing scripts made their way into the spec file with the default being enabled. <sigh>
17:28 JoeJulian Who do we think we are, now, ubuntu?
17:29 JoeJulian Leildin: Anyway, the way to disable it is "gluster volume set $VOL user.smb disable"
17:30 JoeJulian @learn disable samba sharing as To disable samba sharing on a volume, set user.smb=disabled on that volume.
17:30 glusterbot JoeJulian: The operation succeeded.
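A sketch of the two ways to switch it off (volume name illustrative; the hook script names can vary slightly between builds):
    gluster volume set myvol user.smb disable
    # or neutralise the packaged hook scripts that edit smb.conf on volume start/stop/set:
    chmod -x /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh \
             /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh \
             /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh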
17:30 neofob joined #gluster
17:31 JustinClift misc: Can you email me your public key, so I can add you to our Jenkins?
17:31 JustinClift misc: Or point me at a URL to get it from
17:31 JustinClift misc: I know I've gotten it from you before, it's just not showing up in my email history searching ;)
17:32 JoeJulian snewpy: my money's on that being a bug. If a port acquisition fails, it should keep trying until it gets a port, imho.
17:33 y4m4_ joined #gluster
17:37 victori joined #gluster
17:37 snewpy JoeJulian: looking at strace, it's specifically trying to open a privileged port because in rpc/rpc-transport/rdma/src/name.c it doesn't test if bind_insecure is set, unlike the regular socket transport
17:39 T3 joined #gluster
17:39 JoeJulian ding, ding, ding! Winner.
17:39 misc JustinClift: in ~misc/.ssh/authorized_keys on supercolony
17:39 snewpy I don't think there's anything specific about rdma that means that binding an unprivileged port wouldn't work?
17:40 JoeJulian No, I'm sure it's just an oversight.
17:40 JoeJulian Oh, but do file a bug report, please.
17:40 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
17:41 snewpy sweet, I will build a new version with the check added in and see if it fixes it and file a bug with a patch
17:41 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
17:42 Rapture joined #gluster
17:42 JoeJulian snewpy: For submitting patches: ,,(hack)
17:42 glusterbot snewpy: The Development Work Flow is at http://www.gluster.org/community/documentation/index.php/Development_Work_Flow
17:48 JustinClift misc: Tx :)
17:51 snewpy JoeJulian: thx
17:53 JustinClift snewpy: Any interest in creating yourself a Gerrit account, and submitting the patch directly? :)
17:53 JoeJulian JustinClift: ... I just said that. :P
17:53 snewpy JustinClift: yes, working on that at the moment :)
17:53 JustinClift Gah
17:53 JoeJulian hehe
17:54 JustinClift JoeJulian: Sorry, it somehow didn't absorb into my brain
17:54 JustinClift JoeJulian: Is the "Simplified" doc better? http://www.gluster.org/community/documentation/index.php/Simplified_dev_workflow
17:54 rjoseph joined #gluster
17:55 gem joined #gluster
17:56 JoeJulian @factoids change hack 1 "s@http.*@http://www.gluster.org/community/documentation/index.php/Simplified_dev_workflow@"
17:56 glusterbot JoeJulian: The operation succeeded.
17:58 T3 joined #gluster
17:58 JustinClift Cool. :)
17:59 misc JustinClift: so, build.gluster.org can be updated, etc ( like yum-cron, not update to latest rhel ) ?
17:59 misc it is not as critical as gerrit ?
18:02 JustinClift misc: It's absolutely mission critical
18:02 T3 joined #gluster
18:02 JustinClift misc: Both of those servers work together in tandem.  If either stops working, we're screwed
18:03 JustinClift misc: And no, we've not been updating it for anything other than security vulnerabilities.  Manually.
18:03 misc JustinClift: well, you know that it was not kept up to date ?
18:03 misc ouch
18:03 misc ok
18:03 misc why did I sign for this again :) ?
18:03 misc oh yes, drugs and money
18:03 JustinClift Mainly due to the fact that "it's working atm.  we're not sure it'll be working if we update it, as everything is so out of date"
18:04 JoeJulian clone it and update the clone.
18:04 JustinClift JoeJulian: You're completely welcome to do that, and let us know how it goes :)
18:05 misc there is something weird going on with libopenssl
18:05 JustinClift When I asked iWeb if they have some kind of snapshot technology available (it's hosted in iWeb, not Rackspace), they said "no"
18:05 JoeJulian iWeb? poop
18:05 JustinClift So, our closest bet would be to rsync the whole thing somewhere else and try with that
18:06 JustinClift Fairly obviously... I haven't gotten around to it, and it's not near the top of my list
18:06 JoeJulian Mmm...
18:06 misc let's update stuff 1 by one
18:06 JoeJulian I wonder if I could get my son to work on that rather than play games all day.
18:06 misc like updating bash...
18:06 JustinClift Kinda hoping Marcelo's new plan for getting stuff done will make both of these iWeb servers no longer needed
18:07 misc JoeJulian: we could gamify the update :)
18:07 JustinClift :)
18:07 misc grinding on Wow, updating package, kinda the same
18:08 JoeJulian I think I would lean toward making a new CentOS7 host on rackspace. Getting individual packages working, and 301'ing those urls to the new host.
18:08 JoeJulian until they're all done, then just changing the dns pointer, obviously.
18:10 misc yeah, but we would need to know what to convert
18:11 JoeJulian My son would be interested in doing that.
18:12 JustinClift JoeJulian: I have no knowledge of your son.  Is he a *nix Admin?
18:13 JoeJulian We'll spin up a CentOS 7 box, and start adding salt states for the packages that are needed...
18:13 JoeJulian He is, and he's got me as a resource.
18:14 JustinClift Cool.  Email Marcelo about it too maybe FYI style so he knows?
18:14 JoeJulian Also, being a new box, there's no risk.
18:14 * JustinClift agrees
18:30 JoeJulian @lucky glusterfs posix acl
18:30 glusterbot JoeJulian: http://www.gluster.org/community/documentation/index.php/Gluster_3.2:_Troubleshooting_POSIX_ACLs
18:32 JoeJulian Slashman: https://github.com/GlusterFS/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_ACLs.md
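In short, the native client needs the acl mount option (server and volume names illustrative):
    mount -t glusterfs -o acl server1:/myvol /mnt/myvol
    # or in /etc/fstab:
    # server1:/myvol  /mnt/myvol  glusterfs  defaults,acl,_netdev  0 0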
18:36 T0aD joined #gluster
18:36 tlynchpin JoeJulian: i did manage to change to fqdn, it was a bit more fiddling than s/old/new/ - some file renaming
18:38 lalatenduM joined #gluster
18:43 sputnik13 joined #gluster
18:48 JoeJulian tlynchpin: Ah, right! Forgot about the bricks/* info files. Sorry. Glad you got it sorted.
18:49 misc JustinClift: so, I guess there is a good reason to have nginx and apache ?
18:49 Gill_ joined #gluster
18:49 JoeJulian I've not found one yet.
18:50 misc the brick, or the good reason to have 2 servers :) ?
18:50 JoeJulian Though I'm replacing nginx with tengine frequently...
18:50 JoeJulian I've not found a good reason to keep apache in most cases.
18:51 misc I keep apache because I have already written automation around
18:51 misc but most of the time, it doesn't change much
18:52 JoeJulian I prefer the smaller memory footprint, but like everything in this industry, there's a flavor for everyone.
18:53 misc [root@build httpd]# ls /var/www/html/
18:54 misc httpd.conf
18:54 misc mhhh
18:54 misc I think that's not how it should work :)
18:58 misc so, libvirt is running
18:58 misc but is unused since 1 year
18:58 misc thumb up to let it survive, thumb down to kill
18:59 JoeJulian Which machine are we referring to?
18:59 misc build.gluster;org
19:00 JoeJulian My vote is don't waste your time on it. I'll have my son (also Joe Julian) build a new one at Rackspace.
19:01 misc well, I would like to know what is needed in term of services
19:01 misc we have also nrpe, not sure where nagios is
19:02 misc ( and selinux disabled, that's bad )
19:02 JoeJulian yeah... <sigh>
19:03 JoeJulian https://mhayden.spreadshirt.com/
19:03 JoeJulian Everyone needs one of those.
19:03 kminooie :))
19:07 kminooie does it make any difference if the underlying file systems of bricks are different? say ext3 and ext4 ?
19:09 JustinClift misc: No idea about nginx + apache
19:11 JoeJulian kminooie: I can't think of any reason why it should matter off the top of my head. In a replicated volume, the write performance will be that of the slowest replica.
19:14 kminooie ok so let's see if i've got this right. glustershd is the gluster self-heal daemon, right? and I think someone was talking about this yesterday; what should be done when you see errors like this?
19:14 kminooie E [afr-self-heal-entry.c:239:afr_selfheal_detect_gfid_and_type_mismatch] 0-home-fs-replicate-0: Gfid mismatch detected for <d0a692b7-a762-48bc-96bd-8717ecc597fd/.gtk-bookmarks>, 0d0ada76-e8cb-43aa-9ad8-35c7f16b547f on home-fs-client-1 and 7eaa7ee8-4373-414f-8328-91f0e645b790 on home-fs-client-0. Skipping conservative merge on the file.
19:15 kminooie does 'skipping conservative merge' mean that it is just gonna copy one over the other?
19:16 CyrilPeponnet hey guys, I have a strange issue
19:18 kminooie and btw, does or should the self-heal daemon open a port?
19:18 CyrilPeponnet I have a gluster "cluster" running 3.5.2; for a given volume with no client using glusterfs fuse, I can set some properties like enable-ino32. Once a client is connected to a new volume using glusterfs fuse (3.6), I can't set this option anymore. This has to be done *before* any clients get connected.
19:19 misc JustinClift: nginx serve /d on port 443
19:20 misc will look at apache when I have a better network ( ie, not in train in a tunnel )
19:20 JoeJulian kminooie: You are correct that is the self-heal daemon. gfid mismatch on a file means the file was created independently on those two bricks, assigning a different gfid to each (gfid is a psuedo inode number). So you have, essentially, the same filename assigned to two different inodes. The daemon doesn't know which is right, so it skips it.
19:21 kminooie so I have to go and manually remove one of them?
19:22 CyrilPeponnet and I can reproduce. Stop the client, set the property on the volume, remount on the client (all is fine), try to set the same property with the same value again... it fails
19:22 CyrilPeponnet volume set: failed: One or more connected clients cannot support the feature being set. These clients need to be upgraded or disconnected before running this command again
19:22 JoeJulian CyrilPeponnet: Correct. Otherwise the volume will have selected an op_version that suits the existing volume and client. Apparently that option isn't supported by that op_version. You can override that with a volume set command (see gluster volume set help and look for op_version).
19:22 CyrilPeponnet this options inode32 is really old
19:22 JoeJulian It is.
19:24 ilbot3 joined #gluster
19:24 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
19:24 CyrilPeponnet the gluster set op_version is not available in 3.5.2
19:24 JoeJulian Well there you go.
19:24 CyrilPeponnet I tried to deal with op_version in /var/lib/gluster but I screwed everything :p
19:24 Rapture joined #gluster
19:25 CyrilPeponnet so the recommendation is to upgrade to 3.5.3 ?
19:25 CyrilPeponnet 3.6 ?
19:25 CyrilPeponnet (centos7 based)
19:26 CyrilPeponnet or at least downgrade clients to use 3.5.3
19:26 CyrilPeponnet 3.5.2
19:26 CyrilPeponnet before the rpc version scheme change
19:26 JoeJulian Either seems to be fine during operation but there's been a number of people with issues post upgrade. I'm not sure if it had anything to do with their order of operations or a bug.
19:27 fattaneh1 joined #gluster
19:27 JoeJulian Since you seem to be able to not have any clients, I would probably stop the volumes, upgrade to 3.6, and go from there.
19:27 CyrilPeponnet 950 clients
19:27 CyrilPeponnet :p
19:27 CyrilPeponnet using nfs
19:28 CyrilPeponnet we tried to use gluster fuse but we had so many issues
19:28 CyrilPeponnet due to op_version
19:28 JoeJulian Hah.
19:29 Scotch joined #gluster
19:29 JoeJulian Can you test operations in staging, or is everything done in production?
19:29 CyrilPeponnet I just made a test with one client to be sure of the behaviour
19:29 CyrilPeponnet well it's production
19:29 CyrilPeponnet but go ahead
19:29 JoeJulian Then I would do 3.5.3.
19:30 CyrilPeponnet Is there a hot upgrade from 3.5.2 to 3.5.3 ?
19:30 CyrilPeponnet My gluster setup use 3 nodes
19:30 JoeJulian move your clients off 1 server, stop glusterd, pkill -f gluster, upgrade, start glusterd, wait for "gluster volume heal $VOL info" to come back clean for all your volumes. Repeat.
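A sketch of that per-server sequence, assuming an RPM-based install with the matching 3.5.3 repo already configured (service and package commands vary by distro):
    service glusterd stop              # or: systemctl stop glusterd
    pkill -f gluster                   # brick, nfs and self-heal processes
    yum update 'glusterfs*'
    service glusterd start
    gluster volume heal VOLNAME info   # repeat for every volume, wait until nothing is pending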
19:31 CyrilPeponnet how can I force client to use one server ?
19:31 JoeJulian I'm *guessing* that the people that had problems with a 3.6 upgrade didn't wait for heals to finish, but that's purely speculation.
19:32 JoeJulian I assume you're using a floating ip to provide HA to nfs.
19:32 CyrilPeponnet I mean, I have a vip pointing to one of the nodes but I know that connections are dispatched across the nodes
19:32 CyrilPeponnet ok I guess I can stop one of the replica
19:32 JoeJulian Yeah, you can upgrade any server that doesn't have that vip pointing at it.
19:33 JoeJulian Just make sure heals are completed before taking down a replica.
19:33 JustinClift misc: Ahhh, with nginx serving /d on port 443, that's likely so developers can download the build logs and stuff
19:33 CyrilPeponnet will 3.5.3 be able to join a cluster of 3.5.2 ? last time I did that I got peer rejected
19:33 Scotch hola...any 3.3.2 experts here?  Need advice
19:33 JustinClift misc: We used to have the regression and smoke tests running directly on build.gluster.org
19:33 JoeJulian upgrade. ;)
19:33 JoeJulian CyrilPeponnet: I haven't tried that myself.
19:34 JustinClift misc: So, they'd download the failure logs from that host
19:34 JustinClift misc: Now that the regression tests run on the rackspace slave vm's, they're the ones running nginx for serving up those failure logs
19:35 theron joined #gluster
19:35 JustinClift misc: We do have the smoke tests still running from build.gluster.org though, so developers will still likely need to download stuff from /d, for failed smoke run logs
19:36 CyrilPeponnet @JoeJulian Thanks for those explanations I will try a hot upgrade using my puppet-gluster dev setup.
19:36 Scotch is this a good forum for advice on 3.3.2 oddities?
19:37 bene3 joined #gluster
19:37 JoeJulian Ask away, Scotch.
19:39 Scotch thx...inherited a 3.3.2 cluster.  Created vmware server to expand volume without issue.  Started a remove-brick on a physical server.  Seemed to be running fine but somewhere over the span of several days the vm managed to create .glusterfs in the mount dir on the vm
19:40 JoeJulian @what is this new .glusterfs directory
19:40 Scotch result is the creation of the volume dir structure and migration of "some" files.  Now root file system is full.
19:40 JoeJulian @lucky what is this new .glusterfs directory
19:40 glusterbot JoeJulian: http://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/
19:40 JoeJulian Scotch: ^
19:41 JoeJulian Not sure if that's what you're asking about, but there's some info.
19:42 Scotch prob is I had large virtual disks mounted.  It "looks" like the disks were unmounted(?) but gluster continued using mount point, created dirs and .glusterfs in mount "directory" (i.e. NOT the disk that's supposed to be mounted there)
19:42 Scotch read that already :)
19:43 JoeJulian Ah yes. The ol' "forgot to mount the brick" problem. Cured in more recent versions, btw.
19:43 JoeJulian Is this a replicated volume?
19:43 Scotch no replication
19:43 JoeJulian That'll make it a lot more difficult.
19:45 JoeJulian I forget... can you stop a remove-brick back then? "gluster volume remove-brick $vol $brick stop"
19:46 fubada joined #gluster
19:48 Scotch I wish it were due to forgetfulness (at least I would know what the cause was ;) )...mount point was writable, disks were mounted (large TB volumes were present until reboot/failed mount due to full "/")
19:48 Scotch yes
19:49 Scotch tried that, vm in question threw error (that's how I discovered prob)
19:50 elitecoder Any ideas on what it could mean if you have about .. 720 entries like 2015-02-24 19:45:29 /files/secure/2040/contests/151 in your volume heal files info heal-failed list?
19:50 elitecoder : ]
19:50 elitecoder Exact same folder
19:51 spiette joined #gluster
19:51 JustinClift Ugh.  Just noticed some of our regression testing VM's were created with the wrong VM profile.  '2GB General purpose' instead of '2GB Performance'
19:51 JoeJulian elitecoder: look at the timestamp.
19:51 JustinClift Will need to rebuild a few then
19:51 elitecoder JoeJulian: many time stamps.
19:51 elitecoder hundreds
19:51 JoeJulian precisely. And they're sequential.
19:51 JoeJulian It's a log.
19:52 elitecoder Right. . . .
19:53 JoeJulian Scotch: there's no easy cure for this in that version (and obviously no prevention in that version either).
19:54 JoeJulian Does the brick you were trying to remove still show up in "gluster volume info"?
19:54 glusterbot News from newglusterbugs: [Bug 1058300] VMs do not resume after paused state and storage connection to a gluster domain (they will also fail to be manually resumed) <https://bugzilla.redhat.com/show_bug.cgi?id=1058300>
19:54 Scotch it does
19:54 Scotch status inquiry shows remove-brick complete but only 3.4 of ~20TB having been rebalanced
19:55 T3 joined #gluster
19:55 Scotch brick does not reflect the 3.4TB reduction either
19:56 elitecoder JoeJulian: there are a good amount of files in there, I'm taring the folders, and downloading them to compare them
19:57 lalatenduM joined #gluster
19:57 Scotch is it possible to deal with just the vm/server in question?  peer remove/xattr cleanup/uuid cleanup/reattach (or similar)?
19:57 elitecoder Not sure if that's how the pros do it, so if something sounds stupid let me know lol
19:57 JoeJulian Scotch: Then you can probably just kill glusterfsd on the full VM. Ensure the files that were copied onto the VM still exist on the original brick. If they do, you should be able to safely remove everything under where the brick is *supposed* to mount. "mount -a" to ensure the brick actually mounts where it's supposed to. restart glusterd.
19:58 Scotch even better :-D
19:58 JoeJulian If that works, then you should be able to remove-brick again.
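A sketch of that recovery, on the affected VM only (paths are illustrative; the rm step is destructive, so double-check first that the migrated files still exist on the original brick):
    pkill glusterfsd                                     # stop the brick process(es) on this host
    rm -rf /bricks/brick1/*  /bricks/brick1/.glusterfs   # clean the *unmounted* mount-point directory
    mount -a                                             # put the real brick filesystem back in place
    service glusterd restart                             # respawns the brick on top of the proper mount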
20:00 Scotch we'll see...appreciate the advice
20:03 kripper Hi, I removed bricks from a replica-2 volume via oVirt (reducing replica count)
20:04 kripper oVirt was supposed to move the data from brick 2 to brick 1
20:04 kripper it finished and I pressed the "commit" button which actually removed the second brick
20:04 T3 joined #gluster
20:04 kripper but something went wrong, since now I cannot start some VMs
20:05 kripper I guess some files are missing
20:05 kripper Can I copy them from the removed brick to the gluster volume?
20:11 JustinClift JoeJulian: ^ ?
20:11 MugginsM joined #gluster
20:12 toti joined #gluster
20:13 bene2 joined #gluster
20:14 JoeJulian kripper: Mount the volume via fuse someplace on the server that held that removed brick, and copy the images to the volume through that fuse mount.
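A sketch of that, with illustrative paths (keep the copies sparse so the images do not balloon to their full virtual size):
    mount -t glusterfs host1:/gluster-storage /mnt/gluster-storage
    # copy from the removed brick's directory into the volume through the fuse mount:
    cp -a --sparse=always /bricks/removed-brick/images/IMAGE /mnt/gluster-storage/images/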
20:14 elitecoder What's the most current docs regarding fixing files that failed to heal?
20:15 JoeJulian Also, kripper, file a bug report about that problem. Include the rebalance log(s).
20:15 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
20:15 JoeJulian elitecoder: probably my blog post on split-brain.
20:16 JoeJulian It hasn't changed at all since 3.3.
20:16 elitecoder http://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/ correct?
20:16 JoeJulian Yes
20:16 elitecoder Thankya sir.
20:19 deniszh joined #gluster
20:21 theron_ joined #gluster
20:23 kripper JoeJulian: looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1136349
20:23 glusterbot Bug 1136349: high, high, ---, spalai, POST , DHT - remove-brick - data loss - when remove-brick with 'start' is in progress, perform rename operation on files. commit remove-brick, after status is 'completed' and few files are missing.
20:25 JoeJulian kripper: True, does look like that. :(
20:30 DV joined #gluster
20:32 JoeJulian WTF?!?! "Tell the admin to copy any missing files" is a solution?!?!
20:33 kripper JoeJulian: I'm making sure that I'm really missing files
20:33 kripper JoeJulian: and it's not a oVirt issue
20:34 kripper JoeJulian: libvirt says:
20:34 kripper 2015-02-24 20:14:11.683+0000: 19733: error : virStorageFileGetMetadataRecurse:952 : Failed to open file '/rhev/data-center/00000002-0002-0002-0002-000000000183/b55d0462-0dec-41a3-b0fd-6225fc5cf248/images/393a14c3-99df-4ffd-8ded-0705a22d3304/dc9d4128-3172-4259-9923-54db19347fd2': No such file or directory
20:35 kripper But I don't understand what this /rhev/data-center/00000002-0002-0002-0002-000000000183 directory is
20:35 kripper at least, it's not a gluster mount
20:37 kripper The gluster mount (mounted by oVirt) is /rhev/data-center/mnt/glusterSD/host1\:gluster-storage/b55d0462-0dec-41a3-b0fd-6225fc5cf248/images/393a14c3-99df-4ffd-8ded-0705a22d3304
20:37 kripper and it looks fine
20:38 kripper yep, diff says that all files are on the gluster mount, so it looks more like a oVirt issue
20:41 kripper JoeJulian: BTW, do you know where this directory comes from? (/rhev/data-center/00000002-0002-0002-0002-000000000183)
20:42 JoeJulian Nope
20:43 JoeJulian Looks like something ovirt. I don't know much about ovirt. I used it for a day before I scrapped it.
20:44 DV joined #gluster
20:44 snewpy JoeJulian: i hope i did it right... http://review.gluster.org/#/c/9737/
20:45 snewpy turns out it wasn't the lack of checking bind_insecure, but rather that the fallthru of binding an unprivileged port never calls rdma_bind_addr()
20:46 JoeJulian The process looks right. I have no expertise on the code though.
20:48 snewpy cool, thanks again for the help!
20:49 JoeJulian Thank you!
20:54 rwheeler joined #gluster
20:54 glusterbot News from newglusterbugs: [Bug 1195907] RDMA mount fails for unprivileged user without cap_net_bind_service <https://bugzilla.redhat.com/show_bug.cgi?id=1195907>
20:57 kripper JoeJulian: That's a pity. I think oVirt is a great project. The only thing that I hate is that they have no support via IRC. I've been a programmer since I was 8 (I'm now 36) and there is so much I could contribute, but without some initial guidance from the devs that will not be possible.
20:58 JoeJulian I encountered a huge memory leak with the remote-root-execution service which made it unusable.
20:58 PaulCuzner joined #gluster
20:58 JoeJulian I also didn't like a remote-root-execution service.
20:59 misc kripper: mhh, what do you mean no support via irc ?
20:59 JoeJulian They're on oftc.
21:01 misc yep
21:01 JustinClift http://www.ovirt.org/Community
21:01 JustinClift "People using oVirt and all those who develop the software are welcome in the #ovirt channel on irc.OFTC.net."
21:01 JustinClift kripper: ^  Am I being thick again, or is that what you're meaning?
21:02 misc JustinClift: why not both :p
21:04 DV joined #gluster
21:07 JoeJulian Same reason I don't host a #gluster channel over there, I suppose. It's just too much work running 2 channels.
21:08 kripper I never got an answer on #ovirt
21:08 kripper I see no activity there
21:09 JustinClift kripper: On OFTC, or the Freenode one?
21:09 kripper this channel is dead
21:09 misc I think it might be easier on a day when there is a meeting
21:09 kripper on OFTC
21:10 kripper yes, but I don't want to disturb the meeting with newbie questions
21:10 misc well, there will be people
21:10 elitecoder Once a split brain or merge fail is fixed, will it be removed from the log? (volume heal files info heal-failed log)
21:10 kripper I will try the mailing list.
21:10 elitecoder rrr, heal fail. not merge fail.
21:10 misc now, most of the ovirt team is in a European timezone, so indeed, asking a question now is not the right moment, unfortunately
21:11 JoeJulian elitecoder: nope
21:11 JustinClift kripper: Yeah, mailing list is the right avenue when TZ issues hit
21:11 * JustinClift used to have the same problem with a different project, when I used to live in Australia
21:11 JoeJulian Ugh. Hate mailing lists...
21:12 misc JoeJulian: mhh, why ?
21:12 JoeJulian attention span, I think.
21:13 JoeJulian I hate seeing a question that could probably be answered in 5 minutes that takes 2 days.
21:13 JoeJulian And I mean that as the guy giving the answers.
21:14 JoeJulian I find it even more frustrating as the guy with the need to pose a question and have it answered 2 days after I figured it out on my own.
21:16 swebb joined #gluster
21:23 elitecoder heh
21:23 diegows joined #gluster
21:25 DV joined #gluster
21:28 n-st joined #gluster
21:29 elitecoder JoeJulian: Will that snippet for removing files in gluster work if I change it to rm -rf to make it recursive or does each folder and file need to be done individually
21:29 JoeJulian individually.
21:29 JoeJulian Or just use splitmount.
21:29 JoeJulian @splitmount
21:30 JoeJulian seriously, glusterbot?
21:30 JoeJulian @learn splitmount as https://github.com/joejulian/glusterfs-splitbrain
21:30 glusterbot JoeJulian: The operation succeeded.
21:31 JoeJulian I need two more of me...
21:31 elitecoder lol
21:31 T3 joined #gluster
21:32 elitecoder Anything need to be changed for directories? (aside from rm -f ${BRICK}${SBFILE}, which would need to be rmdir or rm -rf)
21:33 JoeJulian If you're using splitmount, you can safely rm -rf and not even have to worry about the gfid files.
21:33 JoeJulian No getfattr, nothing.
21:33 elitecoder ah
21:38 JoeJulian Plus... on the mailing list you get questions that start out like, "We have a gluster volume consisting of a single brick, using replica 2."
21:38 JoeJulian Which is, of course, impossible.
21:40 JoeJulian So now you have a 3 email exchange trying to make sure you (and they) actually know how their volume is configured.
21:41 elitecoder whoa
21:41 elitecoder $ ls
21:41 elitecoder ls: cannot access 151: Input/output error
21:41 elitecoder ls: cannot access 148: Input/output error
21:41 elitecoder ls: cannot access preview: Input/output error
21:41 elitecoder ls: cannot access 97: Input/output error
21:41 elitecoder This is on the client mount point, do split brains do that?
21:41 JoeJulian yep
21:41 elitecoder ok
21:43 JustinClift Hmmm, that's the error that happens with a single brick using replica 2 isn't it? ;)
21:44 JoeJulian hehe
21:44 JustinClift elitecoder: Btw, ignore my smartarse comment to JoeJulian there.  Just in case you're not sure :)
21:44 elitecoder heh i don't have a single brick so i'm just confused lol
21:44 elitecoder wai i get it that's kinda funny
21:44 elitecoder are you talking about a config error specifying two bricks but only having one?
21:45 JoeJulian He's calling back to my raging against mailing list posts.
21:45 elitecoder lol
21:46 elitecoder Does splitmount need to be used with root or
21:46 elitecoder can I just use a regular account with it and mount it in a home folder
21:46 JoeJulian I'm in rage mode this afternoon after reading bug 1136702
21:46 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1136702 unspecified, unspecified, ---, bugs, MODIFIED , Add a warning message to check the removed-bricks for any files left post "remove-brick commit"
21:46 JoeJulian elitecoder: yes, root.
21:47 elitecoder k
21:48 kminooie JoeJulian:  I don't know if you have noticed this, but the hashbang that you are using in splitmount.py would not work in Debian (env is in /usr/bin, not /bin), or am I missing something?
21:49 JoeJulian why the f is it in /usr/bin? That's just dumb.
21:50 JoeJulian It should be available without any filesystems being mounted.
21:51 JoeJulian Raise the issue so I don't forget to put in a workaround for that.
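A minimal sketch of the portable fix being discussed, assuming the script's first line currently points at env under /bin:

    # see what the script uses now
    head -n 1 splitmount.py
    # rewrite the shebang to the conventional location, which exists on
    # Debian, Ubuntu, Fedora and RHEL alike
    sed -i '1s|^#!.*env.*$|#!/usr/bin/env python|' splitmount.py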
21:51 kminooie I am just doing what elitecoder is doing, since I am having the exact same problem, and I am letting you know what is happening as it happens (cloned the git repo 2 min ago)
21:51 kminooie sure
21:53 JustinClift JoeJulian: First I'd heard of that bug.  Yeah, that seems pretty er... crappy.
21:53 JustinClift Just upped its priority and hopefully started a conversation about it.
21:54 glusterbot News from newglusterbugs: [Bug 1195947] Reduce the contents of dependencies from glusterfs-api <https://bugzilla.redhat.com/show_bug.cgi?id=1195947>
21:55 kminooie JoeJulian: http://ur1.ca/jsk8b   how can I debug this?
21:56 jmarley joined #gluster
21:58 jriano joined #gluster
21:59 JoeJulian Hrm. It must have failed to retrieve the fuse vol.
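One hedged way to check that side of things: glusterd keeps the generated client (fuse) volfile on disk, so if it is missing or empty there, retrieving it will fail. VOLNAME is a placeholder.

    # on any of the servers
    ls -l /var/lib/glusterd/vols/VOLNAME/
    # the client volfile should be present and non-empty
    head /var/lib/glusterd/vols/VOLNAME/VOLNAME-fuse.vol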
22:02 elitecoder "Your split replicas are mounted under brickfix, in directories r1 through r2" :D
22:02 JoeJulian Yeah, that's what's supposed to happen.
22:04 DV joined #gluster
22:08 jobewan joined #gluster
22:09 wkf joined #gluster
22:13 kripper does "Transport endpoint is not connected" sound familiar to someone?
22:14 kripper seems to be the cause of a VM not restarting after stopping and restarting a gluster volume
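"Transport endpoint is not connected" usually means the fuse client lost its connection to the volume, or the client process died. A hedged recovery sketch, with server, volume and mount point as placeholders:

    # check whether the mount is still alive
    mount | grep glusterfs
    # lazily unmount the dead mount point and remount it
    umount -l /mnt/gluster
    mount -t glusterfs server1:/VOLNAME /mnt/gluster
    # the client log should say why the connection dropped
    tail -n 100 /var/log/glusterfs/mnt-gluster.log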
22:19 badone_ joined #gluster
22:24 kminooie so yes, the variable orig_vol is empty, which raises the question: what would cause it to fail to retrieve the volume information?
22:25 JoeJulian kminooie: I'm redoing that whole rpc interface. After I wrote it I had a discussion with Jeff Darcy about better ways of doing that. Hadn't gotten around to implementing that until your need suddenly arose.
22:27 kminooie :) I appreciate that. Do you still want me to file an issue for the hashbang? The workaround is to just use #!env instead of an absolute path
22:28 JoeJulian Yeah, I suppose that's good enough.
22:28 * JoeJulian needs to file a bug
22:28 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
22:28 DV joined #gluster
22:31 elitecoder Ok I fixed some files that failed to heal, I want to clear out these logs though... is there a way to clear these logs: volume heal files info heal-failed (in here)
22:34 JoeJulian restart all glusterd
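A sketch of that sequence, assuming "files" above is the volume name. The heal-failed listing is kept by the gluster daemons, so restarting glusterd on every server resets it (the service name varies by distro and version, as comes up just below).

    # on every server in the cluster
    service glusterfs-server restart     # or: service glusterd restart
    # then confirm the listing is gone
    gluster volume heal files info heal-failed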
22:35 kminooie btw https://github.com/joejulian/glusterfs-splitbrain/issues/2  and I am on 3.6.2
22:36 JoeJulian Oh good. I only have 3.5.3 installed and I wanted to test this against 3.6 so that'll be really convenient.
22:37 kminooie I will be happy to test it as soon as you tell me to :)
22:45 elitecoder So I'm having difficulty figuring out how to restart the daemon. It's not in init.d or registered as a 'service', I can't do service gluster restart, etc.
22:45 elitecoder checking ubuntu docs
22:45 semiosis service glusterfs-server restart
22:45 semiosis or restart glusterfs-server, if you prefer
22:45 * semiosis wonders what ubuntu docs would help here
22:46 elitecoder semiosis: using the term loosely
22:46 elitecoder more like googling ubuntu gluster restart
22:46 semiosis @google ubuntu gluster restart
22:46 glusterbot semiosis: High-Availability Storage With GlusterFS 3.2.x On Ubuntu 11.10 ...: <https://www.howtoforge.com/high-availability-storage-with-glusterfs-3.2.x-on-ubuntu-11.10-automatic-file-replication-across-two-storage-servers>; How to install GlusterFS with a replicated volume over 2 nodes on ...:
22:46 glusterbot semiosis: <https://www.howtoforge.com/how-to-install-glusterfs-with-a-replicated-volume-over-2-nodes-on-ubuntu-14.04>; GlusterFS 3.4.1 Packages for Ubuntu Saucy - Gluster Community ...: <http://blog.gluster.org/category/ubuntu/>; QuickStart - GlusterDocumentation - GlusterFS: <http://www.gluster.org/community/documentation/index.php/QuickStart>; ubuntu-glusterfs-3.4 : Louis Zuckerman - (2 more messages)
22:46 elitecoder glusterfs-server: unrecognized service
22:47 semiosis wtf
22:47 elitecoder yeah serious
22:47 semiosis how did you install glusterfs?
22:47 elitecoder I'm sure I used apt-get
22:47 semiosis did you try 'restart glusterfs-server'?
22:49 DV joined #gluster
22:50 elitecoder no
22:50 elitecoder i will
22:51 elitecoder oh there we go it found it that way
22:51 elitecoder doesn't seem to have restarted though
22:55 glusterbot News from resolvedglusterbugs: [Bug 1067059] Support for unit tests in GlusterFS <https://bugzilla.redhat.com/show_bug.cgi?id=1067059>
23:00 DV joined #gluster
23:00 elitecoder Ok cool I restarted both and the heal failed list is empty
23:13 JoeJulian kminooie: there you go
23:15 kminooie semiosis: elitecoder: depending on what version you are using, the init.d file used to be called 'glusterd', so: service glusterd ...
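To summarize the restart variants that come up in this exchange; which one applies depends on the package and init system in use.

    # gluster.org / Ubuntu packages with a SysV init script
    sudo service glusterfs-server restart
    # Ubuntu systems where the package ships an upstart job
    sudo restart glusterfs-server
    # older packages (and RPM-based distros) that name the service glusterd
    sudo service glusterd restart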
23:15 kminooie JoeJulian: here we go :D
23:15 elitecoder Do you just use umount when something has been mounted with splitmount?
23:15 semiosis kminooie: that hasn't been the case on debian/ubuntu in almost 4 years :)
23:15 elitecoder or is there a special way to unmount a splitmounted volume
23:16 JoeJulian Just umount
23:16 elitecoder kk
23:17 elitecoder Oh, the bricks have to be umounted individually, rightooo
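For reference, each replica mount is a separate fuse mount, so unmounting looks roughly like this; the directory names follow the earlier sketch and are assumptions.

    sudo umount /mnt/brickfix/r1
    sudo umount /mnt/brickfix/r2
    # or, for volumes with more replicas
    for d in /mnt/brickfix/r*; do sudo umount "$d"; done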
23:22 kminooie JoeJulian: is there any new dependency? I don't see anything in the setup, but I am getting this:  http://ur1.ca/jsku3
23:22 kminooie brb
23:24 JoeJulian Hrm, must be some poorly handled exception.
23:24 JoeJulian kminooie: Do you have libgfapi installed?
23:29 gildub joined #gluster
23:31 neofob joined #gluster
23:33 neofob left #gluster
23:36 kminooie i have glusterfs-client glusterfs-common glusterfs-server packages from http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/wheezy/apt
23:37 kminooie I assume that they would include libgfapi; is there a way that I can verify that? Let me try it on Fedora
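A hedged way to check: on Debian/Ubuntu the library may ship in glusterfs-common (or in a separate libgfapi0 package, depending on the packaging), while Fedora/EL ships it in glusterfs-api. Either way the runtime linker should know about it.

    # Debian/Ubuntu
    dpkg -l | grep -i -E 'glusterfs|gfapi'
    dpkg -L glusterfs-common | grep -i gfapi
    # Fedora / RHEL / CentOS
    rpm -ql glusterfs-api | grep -i gfapi
    # distro-independent check
    ldconfig -p | grep -i libgfapi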
23:37 Rapture joined #gluster
23:38 kripper left #gluster
23:43 misc is something happening on the gluster infra at rackspace, or i am just unlucky to see them rebooting ?
23:43 kminooie JoeJulian: and btw #!env does not work (sorry about that). I just did a bit of googling and the correct path is indeed /usr/bin/env (the redhat family has it in both places, /bin and /usr/bin, but the 'correct' place is indeed /usr/bin/env)
23:45 kminooie misc: I am not part of that team but today some of them were talking here about some server configuration so I guess they are doing something
23:45 JoeJulian correct, my ass. ;)
23:46 kminooie yeah
23:49 kminooie also I am getting the same error in fedora as well  AttributeError: 'module' object has no attribute 'get_volfile' and on my fedora machine I have these http://ur1.ca/jsl0u installed
23:49 Rapture joined #gluster
23:49 kminooie again from gluster.org
23:49 JoeJulian <sigh> it works for me...
23:49 JoeJulian let me proceed into madness...
23:50 JustinClift misc: Which rebooting?
23:50 * JustinClift just rebooted the backup server, and slave23
23:51 kminooie I don't need to be on any of the peer servers to run this, right? (my fedora machine is not part of the cluster, they are debian)
23:51 JoeJulian Nope
23:52 misc JustinClift: well, I cannot connect to all servers, but maybe that's something else
23:52 JustinClift The backup server has locked down ssh (iptables), if that's the one you're meaning
23:52 JustinClift misc: If you have a static IP, or a not-hugely-wide range, I can add iptables rule in for you
23:53 misc JustinClift: nope, i didn't look at the backup server (yet)
23:53 JustinClift misc: You can also get to it if you come from the gerrit box
23:53 JustinClift k, no worries :)
23:53 misc but slave25.cloud.gluster.org was answering to ssh a few minutes ago and now it doesn't
23:54 JustinClift misc: You're right, it seems to have gone bad
23:54 JustinClift Ahhh
23:54 JustinClift Ahhh
23:54 JustinClift slave25 was the one which I put the sysctl variable in for auto-rebooting on OOM
23:54 JustinClift However, in Rackspace it doesn't work
23:55 JustinClift Instead it shuts down the host
23:55 JustinClift Sorry, the VM
23:55 JustinClift I reported it to Rackspace, and they have no clue why it's not "working as intended"
23:55 JustinClift In their testing of the sysctl variable, the VM reboots. :/
23:55 JustinClift (for their VM)
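The sysctl in question is not named in the log; the usual way to get reboot-on-OOM behaviour (and presumably what is meant here) is to turn an OOM condition into a kernel panic and have the kernel reboot after a panic. The exact settings below are an assumption.

    # assumption: these are the settings being referred to
    sysctl -w vm.panic_on_oom=1   # treat out-of-memory as a kernel panic
    sysctl -w kernel.panic=10     # reboot 10 seconds after any panic
    # persist across reboots
    echo 'vm.panic_on_oom = 1' >> /etc/sysctl.conf
    echo 'kernel.panic = 10'   >> /etc/sysctl.conf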
23:55 misc ah ah
23:55 misc awesome
23:56 MugginsM joined #gluster
23:56 JustinClift I'll power it back on and unset that sysctl variable.  Gimme a sec
23:58 misc JustinClift: so, the slaves, they are not un-updatable like gerrit and the jenkins master, right?
23:59 JustinClift They're supposed to be kept up-to-date
23:59 JustinClift I have yum-cron enabled in their setup script
23:59 JustinClift Though I noticed a few of them don't have it on, so may have been created before that was added
23:59 JustinClift (I enable it when I see that)
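For reference, enabling automatic updates on an EL6-era slave looks roughly like this (a sketch; exact steps depend on the release in use):

    yum install -y yum-cron
    chkconfig yum-cron on
    service yum-cron start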
23:59 elitecoder Thanks for your help, guys. It ended up just being that one folder with issues; I restarted the daemons and ran a full heal, and nothing has turned up in the last hour
