
IRC log for #gluster, 2013-04-09


All times shown according to UTC.

Time Nick Message
00:05 yinyin joined #gluster
00:09 zaitcev joined #gluster
00:18 tc00per root-squash isn't listed in the Planning34 document (http://www.gluster.org/community/documentation/index.php/Planning34) nor is the bug fixing/implementing this in rpc (http://review.gluster.org/#/c/4722/) listed in the tracking bug (https://bugzilla.redhat.com/showdependencytree.cgi?id=895528).
00:18 tc00per Is it safe to assume this won't make it to 3.4?
00:18 glusterbot <http://goo.gl/4yWrh> (at www.gluster.org)
00:20 JoeJulian tc00per: I would check the git log. There's features and bug fixes that weren't on the planning page.
00:23 tc00per Thanks JoeJulian. I noticed in the irclog that some bugs/fixes were 'forgotten' in previous releases. Hoping that mentioning the tracking bug AND this particular item in the irc might help ensure this fix makes it to 3.4 and not get lost along the way.
00:26 JoeJulian Hmm.. better way to do that would be to test 3.4 and report any deficiency to gluster-devel and/or file a bug report.
00:26 glusterbot http://goo.gl/UUuCq
00:42 tc00per Easy enough to do assuming I ever get an opportunity to do it... :)
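For context, checking the release history the way JoeJulian suggests could look roughly like the sketch below; the repository URL and branch name are assumptions, not taken from the discussion above:
    # sketch: search the 3.4 release branch for the root-squash change
    git clone https://github.com/gluster/glusterfs.git   # assumed mirror location
    cd glusterfs
    git log --oneline origin/release-3.4 | grep -i 'root-squash'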
00:44 tc00per left #gluster
00:49 tc00per joined #gluster
00:55 dustint_ joined #gluster
01:00 joehoyle joined #gluster
01:03 yinyin joined #gluster
01:18 robo joined #gluster
01:52 kkeithley1 joined #gluster
01:59 portante` joined #gluster
02:25 lh joined #gluster
02:55 hagarth joined #gluster
03:10 satheesh joined #gluster
03:12 lalatenduM joined #gluster
03:28 satheesh1 joined #gluster
03:30 vshankar joined #gluster
03:50 bala joined #gluster
03:56 deo__ joined #gluster
03:59 sgowda joined #gluster
04:04 bulde joined #gluster
04:05 hagarth joined #gluster
04:23 vpshastry joined #gluster
04:40 rastar joined #gluster
04:45 bharata joined #gluster
04:47 sgowda joined #gluster
04:48 15SAA98RS joined #gluster
04:48 aravindavk joined #gluster
04:54 saurabh joined #gluster
04:54 vpshastry joined #gluster
05:08 samppah does anyone know if red hat storage can be used with "normal" glusterfs client?
05:08 samppah it seems that rhs has a somewhat patched version of 3.3.0
05:38 bala joined #gluster
05:46 raghu joined #gluster
05:48 spai joined #gluster
05:52 Anxuiz joined #gluster
05:58 deepakcs joined #gluster
05:58 andreask joined #gluster
06:08 deepakcs joined #gluster
06:14 mohankumar joined #gluster
06:16 mnaser joined #gluster
06:21 mohankumar joined #gluster
06:26 vimal joined #gluster
06:26 bharata joined #gluster
06:26 satheesh joined #gluster
06:33 ollivera joined #gluster
06:36 ricky-ticky joined #gluster
06:38 mnaser joined #gluster
06:42 deepakcs joined #gluster
06:44 hagarth joined #gluster
06:46 brunoleon_ joined #gluster
06:55 satheesh1 joined #gluster
06:55 jkroon joined #gluster
06:55 ekuric joined #gluster
06:56 satheesh joined #gluster
06:57 magnus^^p joined #gluster
06:57 ctria joined #gluster
07:00 rastar joined #gluster
07:01 satheesh joined #gluster
07:16 hybrid5121 joined #gluster
07:19 tjikkun_work joined #gluster
07:24 alex88 joined #gluster
07:33 andreask joined #gluster
07:34 hagarth joined #gluster
07:39 ProT-0-TypE joined #gluster
07:46 ngoswami joined #gluster
07:51 ProT-0-TypE joined #gluster
07:59 dobber_ joined #gluster
08:06 satheesh1 joined #gluster
08:11 17WABEMKU joined #gluster
08:11 mgebbe__ joined #gluster
08:12 mgebbe joined #gluster
08:14 jkroon joined #gluster
08:19 rotbeard joined #gluster
08:27 ingard__ does anyone know if i need a separate "option auth.login.<brick name>.allow <username>" line per brick or if I can use a wildcard? or even set default options somehow?
08:47 puebele joined #gluster
08:51 bharata joined #gluster
09:07 jkroon hi, given that a cluster (version 3.3) has been set up with replica=2, is it possible to update that to replica=3?
09:08 Norky joined #gluster
09:09 Norky ps xfa
09:09 ujjain joined #gluster
09:09 Norky hmm, wrong window
09:09 Norky good morning #gluster
09:10 Norky hah! I didn't get a ticking off from glusterbot - it doesn't recognise "good morning" as a greeting :)
09:11 Norky I have a largish glusterfs volume (22TB, 300,000 files) from which I have removed some bricks
09:12 Norky after the removal process ran (over a day or two) there were a great many (83,000) files shown as failed, and these remain on the bricks I want to remove. Why and what should I do about it?
09:13 Norky I have actually committed the change and I'm resyncing the files from the now defunct bricks to the mounted volume
09:14 Norky should I have rerun the remove-brick process again?
09:15 Chiku|dc hi I have 2 replicated servers. after I turn off 1 server and turn it back on... files are missing on this server; a few minutes after, server 2 sends the missing files to server 1. that's good
09:16 Chiku|dc but there is 1 file which is not the same size, and this one needs to heal
09:16 Chiku|dc why doesn't it auto heal it ?
09:17 Chiku|dc and all clients use server2 and no longer server 1
09:19 duerF joined #gluster
09:21 sgowda joined #gluster
09:26 manik joined #gluster
09:30 mgebbe_ joined #gluster
09:34 rastar joined #gluster
09:36 vpshastry1 joined #gluster
09:39 jkroon joined #gluster
09:46 rastar joined #gluster
09:55 dobber joined #gluster
09:58 glusterbot New news from newglusterbugs: [Bug 949914] [Feature] GFID based auxillary mount point support <http://goo.gl/zFDC4> || [Bug 949916] [Feature] GFID based auxillary mount point support <http://goo.gl/xLal4> || [Bug 949917] [Feature] GFID based auxillary mount point support <http://goo.gl/Asznu>
10:10 17WABEHEM left #gluster
10:11 Chiku|dc I have 1 file not healed
10:12 Chiku|dc heal-failed no entry
10:12 Chiku|dc and I can't heal this file :(
10:16 hagarth joined #gluster
10:20 jkroon joined #gluster
10:29 sgowda joined #gluster
10:33 joe joined #gluster
10:43 rotbeard joined #gluster
10:43 Chiku|dc can someone help me with 1 file whose healing isn't really working?
10:44 Norky I think the channel is a bit quiet as most people are in bed
10:44 Norky I don't know enough to help you with that I'm afraid
10:45 Chiku|dc did you already try crashed tests and see what happened ?
10:45 Chiku|dc I'm doing this now
10:46 Ergo^ joined #gluster
10:47 Bonaparte joined #gluster
10:50 Bonaparte I added a new brick to a volume and suddenly, the web server, nginx started hanging
10:50 Bonaparte The logs in volumes and glustershd don't show anything unusual
10:50 Bonaparte After stopping the gluster daemons, nginx started working properly
10:51 Bonaparte Any suggestions on how can I go about tracing and fixing this?
10:53 Bonaparte Even listing files using ls command, takes a long time to show the output
10:54 Bonaparte nginx tries to read these files and whole website goes down because of this
10:58 Chiku|dc glusterfs client or nfs client ?
10:58 Bonaparte glusterfs
10:58 Chiku|dc does you your glusterfs are ok ?
10:59 Bonaparte Chiku|dc, sorry?
10:59 Chiku|dc do your glusterfs server are ok ?
10:59 Chiku|dc and the volume
10:59 jkroon hi, given that a cluster (version 3.3) has been set up with replica=2, is it possible to update that to replica=3?
11:00 Bonaparte Chiku|dc, yes,the volumes are okay
11:01 Chiku|dc are there a lot of small files on the volume ? since ls took a long time
11:01 Chiku|dc ?
11:01 Bonaparte Chiku|dc, yes, there are a lot of small files.
11:01 arusso_znc joined #gluster
11:01 Chiku|dc replicated ?
11:01 Bonaparte yes
11:01 Bonaparte replicated
11:02 Chiku|dc is it the 1st time you got this problem ? before that ls was fast ?
11:03 Bonaparte Chiku|dc, yes, before ls was fast. This is the first time happening
11:05 Chiku|dc I don't know... I'm new too. I'm testing some crash scenarios and how healing behaves
11:07 Norky jkroon, I think so, yes
11:08 H__ JoeJulian: yes ext4. My production environment is now 3.3.1 to get replace-brick, but it locks up the source brick with IO for 20 minutes causing outage on the entire volume , then stops and causes the destination brick daemon to busy-wait at 100%. See also my mail to -users
11:09 Norky jkroon: gluster volume add-brick VOLNAME replica 3 SERVER:/BRICKPATH
11:14 jkroon Norky, without adding another brick?
11:14 Norky well... no
11:15 jkroon ok, i'll remove a brick and re-add it
11:15 Norky to where do you expect GlusterFS to make a third copy of all your files?
11:16 Norky how many bricks do you have atm?
11:17 jkroon 4
11:17 jkroon i'll increase to 3 on re-add
11:17 hagarth joined #gluster
11:17 jkroon so removing one brick should not cause data loss
11:17 jkroon and there is sufficient space available.
11:17 Norky err, I think you'll need 2 more bricks
11:18 Norky I'm guessing you have a standard distributed-replicated set up of 4 bricks
11:19 jkroon uhrm, very new to gluster, so yes, probably
11:19 Norky so brick A is a replica of brick B, and brick C is replicated with brick D
11:19 jkroon yes, type is Distributed-Replicate, in 2x2
11:19 Norky making "sub-volumes" 1 and 2
11:19 Norky files are distributed between 1 and 2
11:20 jkroon ok, i'll inform the client that at the moment we're stuck on 2x2 if he wants higher redundancy then we need to increase the number of servers in the cluster.
11:20 Norky to increase the amount of replication, you will need to add two more bricks
11:20 Norky how many servers do you currently have?
11:20 satheesh joined #gluster
11:21 Norky you can have 3x replication and distribution with only 3 servers (2 bricks per server), so long as you don't ask gluster to replicate between bricks on the same server (it will warn you if you try to do that)
11:22 jkroon Norky, 4 servers.
11:22 jkroon hmm, that's an idea perhaps
11:23 jkroon so 4 x 2 == 8, doesn't work, 4 x 3 = 12, which can divide by three, so that could work.
11:23 Norky there are a couple of rules you need to follow when increasing the size (number of bricks) of distributed/replicated volume - it helps to understand the topology
11:23 Norky you could even do 3x6, leaving one server out
11:24 Norky I think you could have a "linked-list" topology with 2 bricks on each server
11:24 jkroon or 3x1 then :)
11:24 Norky ,,(linked list)
11:24 glusterbot Norky: Error: No factoid matches that key.
11:24 Norky ,,(linked)
11:24 glusterbot Norky: Error: No factoid matches that key.
11:24 Norky bother you glusterbot
11:25 Norky http://pthree.org/2013/01/25/glusterfs-linked-list-topology/
11:25 glusterbot <http://goo.gl/0HHCK> (at pthree.org)
11:26 Norky run https://github.com/fvzwieten/lsgvt on one of the servers for an understanding of your current topology
11:26 glusterbot Title: fvzwieten/lsgvt · GitHub (at github.com)
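For context, the add-brick step Norky describes for a 2x2 distributed-replicate volume would look roughly like the sketch below; the volume name and brick paths are placeholders, and one extra brick is needed per existing replica pair:
    # sketch: raise the replica count from 2 to 3 by adding one new brick per sub-volume
    gluster volume add-brick VOLNAME replica 3 server5:/bricks/b1 server6:/bricks/b1
    gluster volume info VOLNAME   # should now report a 2 x 3 layout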
11:36 bcc joined #gluster
11:52 Chiku|dc hum with 3 servers, why 2x3 instead of 1x3 ?
11:52 Chiku|dc why having 2 bricks per server ?
11:58 Nevan joined #gluster
12:01 vpshastry joined #gluster
12:11 bcc left #gluster
12:12 kkeithley1 joined #gluster
12:12 hybrid512 joined #gluster
12:20 red_solar joined #gluster
12:20 andreask joined #gluster
12:21 cw joined #gluster
12:22 bennyturns joined #gluster
12:23 cw Hi. it seems like my gluster setup has gone split-brain on me big time ( https://gist.github.com/Jippi/7c1d9e7c9faa28a06c31 )
12:23 glusterbot <http://goo.gl/7Aen6> (at gist.github.com)
12:23 flrichar joined #gluster
12:23 cw whats the best course of action for fixing this ? googling around seems to provide single-case file resolutions, but considering the huge number of files broken in my setup I'm a bit worried :)
12:24 rotbeard joined #gluster
12:25 cw I'm running 3.3.1-1 on debian :)
12:26 cw simply running gluster volume heal www full  doesn't seem to fix it
12:27 cw running heal info for the volume gives me: https://gist.github.com/Jippi/eff35d12ff15d21bb96b
12:27 glusterbot <http://goo.gl/lqZ3E> (at gist.github.com)
12:29 manik joined #gluster
12:32 ndevos cw: you still need to follow the resolution for a ,,(split-brain), but maybe you are lucky and can figure out if a certain server or brick has the 'old' contents, you could script it then
12:32 glusterbot cw: (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
12:33 cw I don't understand how splits can even happen, I got two gluster servers, in replica setup
12:33 cw and 15 webservers reading/writing
12:33 cw no fancy pantsy or high traffic things going on
12:34 cw and both gluster machines has been running for months without issue
12:34 cw ndevos: it worries me that "heal info" shows / as being split brain too ( https://gist.github.com/Jippi/eff35d12ff15d21bb96b#file-gistfile1-txt-L63 )
12:34 glusterbot <http://goo.gl/WFnIR> (at gist.github.com)
12:36 ndevos well, attributes on / (which is the root of the brick) can have been modified on each gluster server
12:37 cw don't see how, there is no automated scripts touching anything in / - the clients dont even have access to /
12:37 ndevos network issues are a common cause for split-brain scenarios, like, when one webserver can reach one gluster server, and an other webserver only the other gluster server
12:37 cw does it mean I need to fully resync everything when / is split ?
12:38 ndevos no, the entries in that list are affected, if it is a directory, then it could be any 'stat' details or xattrs
12:39 ndevos not the contents of that directory, but rather the directory entry itself
12:39 cw how do I force a resync of those details / xattrs ? the documentation seems a bit thin on that
12:39 cw would be nice with a "hey, server 1 is the good one, if conflict, just trash the remote settings and use server1"
12:40 ndevos that is the part (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX
12:40 glusterbot Title: Fixing split-brain with GlusterFS 3.3 (at goo.gl)
12:41 ndevos you can safely delete the directory contents, there are hardlinks for each file under the .glusterfs directory, the contents is preserved, just the filename is not, recovery is very fast that way
12:42 cw so, if I delete, say /export/www/schweppes-momnts/public/images/preview/516208cf/aa64/417d/b381/62642a72c762.png   it will just re-sync the file?
12:42 cw from "/export/www/.glusterfs/a8/e5/a8e5f4d7-9fc4-4624-ad9a-d9c12cb1a26b"
12:43 ndevos yes, basically thats the idea, but not resync, it is like a hardlink created with 'ln'
12:43 joe joined #gluster
12:43 cw ok, I thought it would be dangerous to play around inside the export - and you should only do things through a mount point
12:44 ndevos which means, the contents is not copied, but the data pointed to by /export/www/.glusterfs/a8/e5/a8e5f4d7-9fc4-4624-ad9a-d9c12cb1a26b gets a new pointer /export/www/schweppes-momnts/public/images/preview/516208cf/aa64/417d/b381/62642a72c762.png
12:46 ndevos yes, if you mess up the .glusterfs directory, you can get in real trouble
12:47 ndevos that directory is hidden when mounting, so it is safe for users
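Pieced together from the blog post referenced above and ndevos' explanation, the per-file cleanup is roughly the sketch below; the brick path, file path and mount point are placeholders, and the commands run on the brick holding the bad copy:
    BRICK=/export/www                         # placeholder brick path
    SBFILE=path/to/split-brained/file         # placeholder file, relative to the brick
    # locate the matching hardlink under .glusterfs (same inode as the file)
    GLINK=$(find "$BRICK/.glusterfs" -samefile "$BRICK/$SBFILE")
    # remove both copies on the bad brick, then trigger self-heal through a client mount
    rm -f "$BRICK/$SBFILE" "$GLINK"
    stat "/mnt/glustervol/$SBFILE"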
12:47 cw running "gluster volume heal www info"   shows a lot of weird "<gfid:ee1e4487-ccee-43f6-bcfe-9a12a1c2112f>" where I asume a file name should have been
12:48 ndevos also, when you create files/directories/stuff inside the brick without going through a mount, it will miss the entry in .glusterfs/ and the required xattrs are not set either
12:48 cw yeah, never touched the /export folder directly
12:49 ndevos those gfid entries are likely for files/directories that are missing (outside the .glusterfs directory)
12:50 cw outside ? :o how would that be possible
12:50 nueces joined #gluster
12:51 ndevos like /export/www/schweppes-momnts/public/images/preview/516208cf/aa64/417d/b381/62642a72c762.png is missing, but /export/www/.glusterfs/a8/e5/a8e5f4d7-9fc4-4624-ad9a-d9c12cb1a26b exists
12:52 cw hmm ok
12:52 cw so thats deleted files not cleared from glusters own internal knowledge of the FS?
12:53 andreask joined #gluster
12:54 ndevos could be, or a split-brain directory prevents the actual filename from being created (and maybe other causes)
12:54 cw impossible to resolve a split on that when you've got no idea what those gfids represent
12:55 cw ndevos: can I buy an hour of your time to help me? :)
12:56 ndevos sometimes the other servers have the actual filename for such a gfid entry, and resolving split-brain directories often makes healing others easier
12:56 ndevos haha, well if you have Red Hat Storage subscriptions you could call support and speak to one of my colleagues ;)
12:57 cw true that
12:57 cw I've resigned my job because of limitations like that at the company, so I agree
12:58 cw though I'm not the man with the $
12:58 glusterbot New news from newglusterbugs: [Bug 918917] 3.4 Beta1 Tracker <http://goo.gl/xL9yF>
12:59 hagarth joined #gluster
12:59 ndevos yeah, I understand, thats also one of the reasons for doing some community 'support' as a ,,(volunteer)
12:59 glusterbot A person who voluntarily undertakes or expresses a willingness to undertake a service: as one who renders a service or takes part in a transaction while having no legal concern or interest or receiving valuable consideration.
13:00 cw usually I'm able to bribe people to help for an hour or so with beer and paypals :)
13:00 cw I've resolved splits before, but this time it seems to be a whole lot more wacked than the other simple one or two files
13:01 cw doesn't feel comfortable playing around in .gluster alone :)
13:01 ndevos oh, I think there are some in this channel that do that occasionally, but I doubt they work for the same company
13:01 ndevos ... as me
13:02 puebele1 joined #gluster
13:03 cw yeah, noticed you were from redhat, which is why I hoped you could help :)
13:04 cw down to 52 issues now, all of them seem to be on gluster01
13:04 cw gluster02 reports "all ok"
13:06 robo joined #gluster
13:07 cw ndevos: whats the price for a service subscription?
13:07 cw can't find them online
13:08 ndevos hmm, I would not know, but www.redhat.com/products/storage-server/on-premise/ contains a sales link :-/
13:09 jruggiero joined #gluster
13:09 jruggiero left #gluster
13:09 cw doesn't sound like a quick fix :|
13:09 ndevos nah, not really
13:09 cw apparently a client is about to launch a big ass campaign now and this is holding them back
13:09 cw one of the biggest media budgets in our country is on hold because of this *sad face*
13:09 ndevos but, maybe you're lucky and those 52 errors contain a directory?
13:10 ndevos sounds like you are a suitable candidate to become a RHS customer ;)
13:10 cw yeah, doesn't solve my short term issue now though :)
13:11 ndevos no, but 52 entries left is manageable, I think+hope
13:11 cw not when you got no idea what to do with them :P
13:16 dustint joined #gluster
13:18 ndevos well, you could check if those gfid entries are on all the replicas
13:18 ndevos under the .glusterfs directory I mean
13:19 cw and if they aren't ?
13:19 ndevos they might have been deleted from the filesystem
13:19 cw http://joejulian.name/blog/fixing-split-brain-with-glusterfs-33/  said to wipe it on the bad server, and then stat through the mount point - didn't work for me though
13:19 glusterbot <http://goo.gl/FPFUX> (at joejulian.name)
13:19 ndevos what did you wipe, and what did you stat?
13:19 gbrand_ joined #gluster
13:20 cw I wiped what his script said, and stat the file $SBFILE through the mount point
13:20 rcheleguini joined #gluster
13:21 ndevos and on a good replica, that $SBFILE exists on the brick?
13:21 cw yes
13:23 ndevos maybe some cashes are in place? killing the glusterfsd for the brick and 'gluster volume start MYVOL force' to start it again?
13:23 ndevos caches even
13:24 ndevos although, I think a stat would flush the cache too
13:28 cw ndevos: I've managed to reduce it to this now: https://gist.github.com/Jippi/3905dfe6c9bdd40f3b49
13:28 glusterbot <http://goo.gl/YuFMn> (at gist.github.com)
13:28 glusterbot New news from newglusterbugs: [Bug 950006] replace-brick activity dies, destination glusterfs spins at 100% CPU forever <http://goo.gl/BkKJE> || [Bug 950024] replace-brick immediately saturates IO on source brick causing the entire volume to be unavailable, then dies <http://goo.gl/RBGOS>
13:28 cw *updated* ^  to include full output
13:29 ndevos cw: it may also well be that the list gets smaller with subsequent heals (if there are directories included)
13:30 cw perhaps, I've managed to fix all those where files are actually outputted
13:30 cw all of those were things we generate on the fly anyway, so just wiped them
13:31 johnmark cw: right now, Red Hat is the only company that offers support for GlusterFS (Red Hat Storage) - although that may change going forward
13:32 cw johnmark: aye, ndevos is very helpful right now :) would be nice if redhat had a big red panic button subscription :)
13:32 jclift_ joined #gluster
13:34 cw ndevos: eh, https://gist.github.com/Jippi/5bd4f65e39cce2e7b2e5#file-gistfile1-txt-L19      0 x 4 = 2 ?
13:34 glusterbot <http://goo.gl/C1mq0> (at gist.github.com)
13:34 cw thats some messed up math?
13:35 johnmark cw: heh heh :)
13:35 ndevos cw: actually, I've seen that before... maybe I can find it again
13:39 cw ndevos: alright
13:39 ndevos cw: those counters are in the /var/lib/glusterd/vols/MYVOL/info file, you can modify those counters when the volume is taken offline - I dont have much more details than that
13:40 cw ok, just wondering how 2 servers, in replica, with a brick each can end up as 0 x 4
13:41 ndevos oh, and glusterd should probably not be running when you modify that info file
13:41 cw at most it should be 0 x 2 :P
13:41 cw sub_count=4     doesn't tell me much about how it ended up with 4 of anything :)
13:42 ndevos yeah, thats what I could not figure out the last time, and I could not reproduce it either
13:42 cw so I can change it back to 2 in sub_count ?
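If anyone does attempt what ndevos describes, a very rough sketch follows; the volume name is a placeholder, the exact keys in the info file vary by version, and this is not a documented procedure, so treat it as a last resort on a stopped volume:
    # on every server, with the volume stopped and glusterd not running
    service glusterd stop
    vi /var/lib/glusterd/vols/MYVOL/info   # e.g. change sub_count=4 back to sub_count=2
    # keep the file identical on all servers, then restart
    service glusterd start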
13:42 piotrektt joined #gluster
13:46 cw ndevos: is it even possible to reverse a "file" like <gfid:944380ed-bec1-4a81-970b-961c46d19d6a> back into the path it's supposed to be?
13:48 bennyturns joined #gluster
13:49 lalatenduM joined #gluster
13:51 ndevos cw: the gfid entry in the .glusterfs directory is a hardlink for files, a symlink for directories
13:52 ndevos so, if the gfid 944380ed-bec1-4a81-970b-961c46d19d6a under the .glusterfs directory is a symlink, 'readlink' would show the name of the directory
13:52 cw let me try that :)
13:53 manik joined #gluster
13:53 ndevos if it is not a symlink, the inode of the gfid 'file' is the same as the inode of the actual filename
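A short sketch of turning a gfid back into a name on a brick, along the lines ndevos describes; the brick path and gfid below are placeholders:
    BRICK=/export/www                             # placeholder brick path
    GFID=944380ed-bec1-4a81-970b-961c46d19d6a     # placeholder gfid
    G="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
    if [ -L "$G" ]; then
        readlink "$G"     # directories: the symlink target reveals the directory name
    else
        # files: the gfid entry is a hardlink, so search for the same inode outside .glusterfs
        find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -samefile "$G" -print
    fi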
13:53 cw hmm, well, thats "annoying".. restarted both servers and no more split brain
13:54 vpshastry joined #gluster
13:54 ndevos oh, interesting, that is after running a volume heal command?
13:55 semiosis @latest
13:55 glusterbot semiosis: The latest version is available at http://goo.gl/zO0Fa . There is a .repo file for yum or see @ppa for ubuntu.
13:56 dustint joined #gluster
13:57 cw ndevos: i gave up trying to figure out those gfid's things, so just entered reboot angrily and it fixed itself
13:57 cw does self-heal in gluster always change user/group to root / root ?
13:57 cw noticed that things self-heal handled don't keep their original id of www-data / www-data
13:58 ndevos cw: well, I'm glad its resolved now
13:58 ndevos self-heal is started as root for all I know, I very much doubt it would function otherwise
13:59 cw ok, that explains why some of our files pop up as root/root all of the sudden when they should be www-data / www-data
13:59 cw scary :O
13:59 cw when running exposed on web
13:59 ndevos or, maybe you mean self-heal changes the permissions of some files/dirs? that would be unexpected
14:00 ndevos /var/log/glusterfs/glustershd.log may have some details
14:01 vshankar joined #gluster
14:02 cw ndevos: will look into it once the dust settles
14:02 cw ndevos: thank you so much for helping :)
14:03 ndevos you're welcome :)
14:03 satheesh joined #gluster
14:10 H__ heh, this -> gluster volume rebalance vol01 fix-layout status
14:10 H__ gives : Usage: volume rebalance <VOLNAME> [fix-layout] {start|stop|status} [force]
14:10 H__ which matches afaik. New bug ?
14:13 semiosis status probably doesnt accept fix-layout.  just guessing.
14:14 semiosis intuitively fix-layout seems to only apply to the start operation
14:15 H__ when i remove fix-layout all status counters are 0. Any ideas how to find when I can continue with migrate-data ?
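For reference, the usage string above suggests a sequence roughly like the sketch below; the volume name comes from H__'s command, status apparently does not accept fix-layout (per semiosis' guess), and plain 'start' is believed to cover the data migration step that earlier releases called migrate-data:
    gluster volume rebalance vol01 fix-layout start
    gluster volume rebalance vol01 status          # status without the fix-layout keyword
    # once the fix-layout run reports completed, start the data migration
    gluster volume rebalance vol01 start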
14:18 daMaestro joined #gluster
14:23 satheesh1 joined #gluster
14:24 chirino hi everyone… if I do a gluster volume set all cluster.server-quorum-ratio 50%
14:25 chirino and take down 2 out of the 3 bricks in my vol.  shouldn't reads fail?
14:27 cw isn't that a 3.4 thing?
14:29 glusterbot New news from newglusterbugs: [Bug 950048] [RHEV-RHS]: "gluster volume sync" command not working as expected <http://goo.gl/ZO2Gy>
14:33 semiosis chirino: is vs. ought.  afaik, loss of quorum only prevents writes, not reads.
14:33 semiosis whether reads *should* fail is another question
14:34 chirino semiosis: thought cluster.server-quorum-ratio was supposed to take down brinks if there is quorum.
14:34 chirino assumed if there were no bricks at all, the perhaps reads should fail.
14:35 chirino s/ if there is quorum./ if there is no quorum./
14:35 glusterbot What chirino meant to say was: semiosis: thought cluster.server-quorum-ratio was supposed to take down brinks if there is no quorum.
14:35 chirino @glusterbot Your a mind reader!
14:37 semiosis chirino: could you ,,(pasteinfo) about this volume?
14:37 glusterbot chirino: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
14:38 lh joined #gluster
14:38 lh joined #gluster
14:38 chirino http://pastie.org/7386720
14:38 glusterbot Title: #7386720 - Pastie (at pastie.org)
14:39 semiosis how are you 'taking down' bricks?
14:39 chirino http://dpaste.org/hFo2O/
14:39 glusterbot Title: dpaste.de: Snippet #224027 (at dpaste.org)
14:40 chirino service glusterd stop
14:40 chirino on FC 17
14:40 semiosis does that kill the glusterfsd ,,(processes) as well?
14:40 glusterbot the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
14:41 chirino yes
14:41 chirino non are running.
14:41 chirino none even
14:42 semiosis i'll try to reproduce this
14:44 chirino semiosis: this is the script I use to test: http://dpaste.org/BXdrY/
14:44 glusterbot Title: dpaste.de: Snippet #224028 (at dpaste.org)
14:44 chirino semiosis: so you are expecting the reads to fail?
14:45 semiosis well, if all bricks are stopped, then reads must/will fail.  if reads aren't failing, then some replica must still be online
14:45 semiosis so going to run through this and see what i can see
14:46 chirino wonder if FC 17's start scripts are just restarting the brick.
14:46 chirino doesn't FC 17 have a watchdog type thingy for services.
14:46 chirino ?
14:47 semiosis idk
14:47 chirino semiosis: what linux you on?
14:47 semiosis ubuntu
14:47 chirino I'm using these rpms BTW: glusterfs-3.4.0alpha2-1.fc17.i686
14:47 duerF joined #gluster
14:47 JoeJulian "service glusterd stop" on Fedora 17 does not stop glusterfsd.
14:47 semiosis JoeJulian: thanks that seemed fishy to me
14:48 chirino JoeJulian: explain this then: http://dpaste.org/y5H5a/
14:48 glusterbot Title: dpaste.de: Snippet #224030 (at dpaste.org)
14:49 JoeJulian Well, I was typing and hit enter just as you posted the alpha version you're testing...
14:49 chirino JoeJulian: am I missing something?
14:50 chirino BTW this is true also for 3.3.1
14:51 JoeJulian Dammit, I don't have any fedora17 left to prove that on.
14:51 JoeJulian Would you accept a fedora 18?
14:52 chirino if you install the same rpm I'm using.
14:52 chirino which is the fc17 rpm :)
14:53 JoeJulian ,,(meh) I'm not that interested. File a bug report. Stopping glusterd should not stop the bricks.
14:53 glusterbot I'm not happy about it either
14:55 aliguori joined #gluster
14:55 lh joined #gluster
14:55 lh joined #gluster
14:55 chirino so I'm not seeing the glusterfsd being killed on node that has the remaining brick.
14:55 JoeJulian File a bug report. Stopping glusterd should not stop the bricks. All your bricks /should/ still be up.
14:56 sgowda joined #gluster
14:56 chirino JoeJulian: glusterfsd is only running on 1 out of my 3 brick servers
14:59 glusterbot New news from newglusterbugs: [Bug 895528] 3.4 Alpha Tracker <http://goo.gl/hZmy9>
15:09 semiosis chirino: i suspect you need to enable server-quorum.  run 'gluster volume set help' and look at the info on option cluster.server-quorum-type near the bottom.  perhaps if you set that to 'server' it will work as expected?
15:09 semiosis i'm still working on reproducing but that jumped out at me along the way
15:10 JoeJulian I still don't know why quorum would ever stop reads
15:10 bugs_ joined #gluster
15:10 JoeJulian Ah, yes I do...
15:11 JoeJulian Too slow. Need coffee...
15:11 semiosis JoeJulian: new feature, glusterd kills surviving bricks if theres not enough replicas to make a quorum
15:11 ramkrsna joined #gluster
15:12 semiosis apparently, anyway
15:12 JoeJulian Seriously?
15:12 semiosis http://www.gluster.org/community/documentation/index.php/Features/Server-quorum & https://bugzilla.redhat.com/show_bug.cgi?id=839595
15:12 glusterbot <http://goo.gl/vrw2D> (at www.gluster.org)
15:12 glusterbot Bug 839595: high, unspecified, ---, pkarampu, ON_QA , Implement a server side quorum in glusterd
15:13 semiosis 3.3 had a client-side quorum, in which a client would turn read-only if it couldn't see a majority of bricks
15:13 JoeJulian Oh, my.... I'll have to look at that patch. I can think of lots of ways that could be bad.
15:13 semiosis haha
15:13 semiosis fortunately either/both of these quorum types are optional
15:14 JoeJulian Yes, but... All servers are down. Start them up. First one up sees that there's no quorum and goes down. Next one up sees that there's no quorum and goes down.... etc...
15:15 JoeJulian I'm sure they must have thought about that though.
15:16 mriv joined #gluster
15:16 semiosis glusterd/glusterfsd distinction should handle that.  glusterd is up, waits to see other glusterds are up before starting glusterfsds
15:17 semiosis i think of it like automatic volume start/stop based on having a majority of glusterds running
15:17 semiosis no circular dependency there
15:17 semiosis get some coffee
15:20 JoeJulian so far that's not what I'm reading...
15:20 semiosis chirino: i can't even set server-quorum-ratio! http://pastie.org/7387446
15:20 glusterbot Title: #7387446 - Pastie (at pastie.org)
15:21 ninkotech joined #gluster
15:21 * semiosis tries with three bricks
15:21 ninkotech__ joined #gluster
15:26 JoeJulian Ah, I see. 839595 only refers to glusterd quorum, not volume quorum.
15:28 semiosis bug 839595
15:28 glusterbot Bug http://goo.gl/ZEu0U high, unspecified, ---, pkarampu, ON_QA , Implement a server side quorum in glusterd
15:30 andrewbogott left #gluster
15:31 Chiku|dc I read this in statedump inodelk.inodelk[9](BLOCKED)=type=WRITE
15:31 Chiku|dc I have 1 file which can't be headed
15:31 manik joined #gluster
15:31 Chiku|dc how can I heal it
15:32 Chiku|dc replicated servers
15:33 semiosis ,,(self heal)
15:33 glusterbot I do not know about 'self heal', but I do know about these similar topics: 'targeted self heal'
15:33 semiosis ,,(split brain)
15:33 glusterbot I do not know about 'split brain', but I do know about these similar topics: 'split-brain'
15:33 semiosis ,,(split-brain)
15:33 glusterbot (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
15:33 semiosis Chiku|dc: #2 ^
15:35 Chiku|dc semiosis, but if I do volume heal myvol info split-brain. I have 0
15:35 semiosis idk what to say, hoped that article would help
15:35 chirino semiosis, use 'all'
15:35 chirino instead of a volume.
15:35 chirino that worked for me.
15:36 semiosis ooh good to know
15:36 chirino gluster volume set all cluster.server-quorum-ratio 50%
15:36 semiosis i already deleted my volumes though, i was going to spin up a 3rd vm but i need to get back to work before this eats my whole day
15:36 semiosis will try again later this afternoon or tmrw
15:37 semiosis in the mean time, you should try setting server-quorum-type to 'server'
15:37 semiosis that might make the difference
15:37 semiosis seems like you're setting the ration option on a feature that's not enabled
15:37 semiosis s/ration/ratio/
15:37 glusterbot What semiosis meant to say was: seems like you're setting the ratio option on a feature that's not enabled
15:38 Chiku|dc oh the file sounds like split-brain
15:38 Chiku|dc but if I read this http://www.joejulian.name/blog/fixing-split-brain-with-glusterfs-33/
15:38 chirino semiosis: whoa.. yay.. tried your idea of setting it to 51% and the test works.
15:38 glusterbot <http://goo.gl/FzjC6> (at www.joejulian.name)
15:39 Chiku|dc I don't find same display
15:39 semiosis \o/
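Pieced together from this exchange, enabling the server-side quorum looked roughly like the sketch below; the volume name is a placeholder and the feature was still new at the time:
    gluster volume set MYVOL cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%
    # glusterd then stops its local bricks when fewer than 51% of the peers are reachable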
15:41 Chiku|dc when I do this command I have Number of entries: 0
15:41 DataBeaver joined #gluster
15:42 Chiku|dc but glusterfs try to heal the file each 10 minutes, and doesn't even mark this file as heal-failed
15:42 m0zes joined #gluster
15:47 hybrid5121 joined #gluster
15:48 manik joined #gluster
15:48 hybrid5121 joined #gluster
15:55 chirino so does cluster.server-quorum-ratio setting refer to the # of bricks in a volume, or the # of peers in the cluster?
15:57 semiosis as i understand it, the number of peers (online vs total)
15:59 rastar joined #gluster
16:01 chirino semiosis: seems a little odd in relation to volumes then.
16:02 semiosis hence the "volume set all"
16:02 jag3773 joined #gluster
16:02 semiosis it does feel kludgy
16:02 chirino I could have 100 peers in a cluster, but only 3 bricks in a volume.
16:03 chirino If I lose 2 of the bricks, I was hoping to use the server-quorum-ratio to take down the 3rd brick.
16:04 semiosis it is possible for gluster to do the right thing and evaluate the quorum ratio wisely with regard to bricks in volumes
16:04 semiosis this is all very new stuff though, not sure how it works
16:04 premera joined #gluster
16:04 semiosis seems like it should do what you want
16:06 chirino so in my test I was running 4 nodes. 3 /w bricks, 1 running as a pure client.
16:06 chirino I got a feeling that the 51% setting wont work anymore if I increase it to 5 nodes.
16:08 mnaser joined #gluster
16:14 mgebbe_ joined #gluster
16:17 bala joined #gluster
16:22 bulde joined #gluster
16:32 zaitcev joined #gluster
16:38 Anxuiz Is there a way to change the IP address that gluster connects to for bricks without having to copy the bricks?
16:38 Anxuiz Right now I have each node talking to each other on the public internet IP addresses and I want to change them to use the vlan
16:40 Supermathie Anxuiz: I stopped gluster and edited the files by hand. Changed the peer name to be a special name in /etc/hosts (or you could use IP)
16:40 Mo___ joined #gluster
16:41 semiosis Anxuiz: split horizon dns
16:43 Supermathie Anxuiz: Actually, I was initially going to change the peer name, but things went weird. I put it back to the proper name and just made sure there was an entry in /etc/hosts for the other server corresponding to the correct IP (split horizon) :)
16:44 Anxuiz Ah, I will do that. Is there no way to change the peer hostname without copying?
16:45 Supermathie You don't need to copy anything
16:45 Supermathie Oh wait, change the peer hostname… you may just be able to edit all the files.
16:45 Supermathie It didn't work for me, maybe I missed something
16:46 Supermathie Or maybe that was the wrong way to go about it :)
16:46 sjoeboo joined #gluster
16:47 semiosis i use ,,(hostnames) and dedicate a hostname to gluster for each server... gluster1.my.domain.net for example, so i can have flexibility
16:47 glusterbot Hostnames can be used instead of IPs for server (peer) addresses. To update an existing peer's address from IP to hostname, just probe it by name from any other peer. When creating a new pool, probe all other servers by name from the first, then probe the first by name from just one of the others.
16:49 Anxuiz Sounds good. Thanks for the help :)
16:49 semiosis yw good luck
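Following the factoid above, moving an existing pool from IP addresses to hostnames is just a re-probe; the hostnames below are placeholders:
    # from any peer other than gluster1, probe it by its new name
    gluster peer probe gluster1.my.domain.net
    # repeat for each remaining peer (probing from gluster1 or any other peer)
    gluster peer probe gluster2.my.domain.net
    gluster peer status   # the peer list should now show the hostnames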
16:51 raghu joined #gluster
16:56 lalatenduM joined #gluster
16:59 andreask joined #gluster
17:04 robos joined #gluster
17:07 ProT-0-TypE joined #gluster
17:08 gbrand_ joined #gluster
17:19 ramkrsna joined #gluster
17:19 ramkrsna joined #gluster
17:29 glusterbot New news from newglusterbugs: [Bug 950121] gluster doesn't like Oracle's FSINFO RPC call <http://goo.gl/H1BrV>
17:33 jkroon joined #gluster
17:33 cw joined #gluster
17:43 jskinner_ joined #gluster
17:44 joehoyle joined #gluster
18:32 ProT-0-TypE joined #gluster
18:43 jskinn___ joined #gluster
18:45 roo9 https://gist.github.com/AdamJacobMuller/48a03f7cef46b2f10719
18:45 glusterbot <http://goo.gl/O90ak> (at gist.github.com)
18:45 roo9 anyone seen anything like this?
18:54 JoeJulian roo9: post the client log around that test.
18:55 robo joined #gluster
18:57 roo9 JoeJulian: I have some older errors like "remote operation failed: No such file or directory"
18:57 roo9 the test directory exists on all nodes though
18:58 JoeJulian Well, it does now, but did it when it posted that error?
18:58 ctria joined #gluster
18:58 roo9 possibly not, but would that somehow corrupt the internals of gluster?
18:58 JoeJulian Oh, yeah.
18:59 JoeJulian The directories are all supposed to exist on all bricks.
18:59 Supermathie roo9: Yes, I have:
18:59 roo9 weird, what could cause it to not be created
19:00 Supermathie roo9: http://i.imgur.com/JU8AFrt.png
19:00 glusterbot New news from newglusterbugs: [Bug 874498] execstack shows that the stack is executable for some of the libraries <http://goo.gl/NfsDK>
19:00 JoeJulian Not sure. There's probably evidence in one or more client logs.
19:00 Supermathie roo9: Your first ls -al makes it look like you're inside a directory that's been deleted.
19:00 JoeJulian Oh, right, good point Supermathie. I'm also basing my thought process on the fuse client.
19:00 JoeJulian If you're mounting via nfs, I don't know what fscache does to things.
19:01 roo9 using the fuse client
19:02 Supermathie OK, not same problem as me then. Didn't think so :p
19:02 roo9 hrm, by that theory, restarting/remounting on the client should fix it then?
19:02 roo9 (it does not)
19:03 JoeJulian If there's no clue in the client log, then I'd check the brick logs.
19:03 roo9 "/test: failed to get the gfid from dict"
19:04 JoeJulian I'd make sure all the bricks are up. Make sure the client's connected to them. I'd look in the logs for " E " then maybe look above that as well to see what led up to it.
19:05 JoeJulian I also wouldn't edit error lines to just give what you think is important. :P I often open the source around an error message to see what the source says the reason is. Plus, I can't remember if that's a Warning, Info or Error.
19:06 roo9 sure, sorry, i was trying to make it briefer to put into chat
19:06 JoeJulian I understand.
19:06 roo9 [2013-04-09 15:02:14.935672] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-volume-replicate-3: background entry self-heal failed on /
19:07 roo9 that seems persistent, occurs relatively frequently
19:07 JoeJulian Ah. That one.
19:08 JoeJulian I think that's... (waiting on bugzilla....)
19:08 JoeJulian bug 859581
19:08 vincent_vdk joined #gluster
19:08 glusterbot Bug http://goo.gl/60bn6 high, unspecified, ---, vsomyaju, ASSIGNED , self-heal process can sometimes create directories instead of symlinks for the root gfid file in .glusterfs
19:09 JoeJulian Check .glusterfs/00/00/00000000-0000-0000-0000-000000000001 on all your bricks. If one of them is NOT a symlink, add the results of "getfattr -m . -d -e hex .glusterfs/00/00/00000000-0000-0000-0000-000000000001" to that bug report.
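A quick sketch of running that check across a server's bricks; the brick paths are hypothetical:
    for BRICK in /export/brick1 /export/brick2; do   # hypothetical brick paths
        G="$BRICK/.glusterfs/00/00/00000000-0000-0000-0000-000000000001"
        if [ -L "$G" ]; then
            echo "$BRICK: ok, root gfid is a symlink"
        else
            getfattr -m . -d -e hex "$G"             # attach this output to the bug report
        fi
    done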
19:09 joehoyle joined #gluster
19:10 roo9 hrm, all symlinks to ../../..
19:10 JoeJulian rats.
19:12 JoeJulian Ok, then on the bricks belonging to 0-volume-replicate-3 (they start numbering at 0) look in the logs for something to do with /
19:12 JoeJulian Er, also make sure the client's got open tcp connections to those bricks.
19:13 roo9 so the 4th brick?
19:13 JoeJulian No, the 4th replica
19:14 JoeJulian If it's replica 2 then the 7th and 8th bricks.
19:15 roo9 hrm, no errors in logs on either brick
19:15 roo9 just a connect/disconnect that occurred during that timeframe (i remounted the client)
19:20 roo9 all bricks up/connected as well
19:36 jskinner_ joined #gluster
19:42 jdarcy joined #gluster
19:49 joehoyle joined #gluster
20:05 ladd joined #gluster
20:10 andreask joined #gluster
20:19 lh joined #gluster
20:22 __Bryan__ joined #gluster
20:38 duerF joined #gluster
20:49 brunoleon__ joined #gluster
20:52 Hydrazine joined #gluster
21:04 georges joined #gluster
21:21 zaitcev joined #gluster
21:25 jkroon joined #gluster
21:27 furkaboo joined #gluster
21:28 Zengineer joined #gluster
21:28 joehoyle joined #gluster
21:28 samppah_ joined #gluster
21:28 msvbhat_ joined #gluster
21:31 Kins_ joined #gluster
21:31 RobertLaptop joined #gluster
21:38 gbrand_ joined #gluster
21:38 inodb_ joined #gluster
21:40 kkeithley joined #gluster
21:54 tc00per left #gluster
22:12 magnus^p joined #gluster
23:12 dustint joined #gluster
23:45 vex anyone have any tips/suggestions for backing up gluster volumes?
23:48 stoile joined #gluster
23:57 JoeJulian I just use rsync from client mounts.
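As a sketch of that approach, with a placeholder mount point and destination:
    # back up through a client mount of the volume, not directly from the bricks
    rsync -a --delete /mnt/glustervol/ /backups/glustervol/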
