
IRC log for #gluster, 2012-12-27


All times shown according to UTC.

Time Nick Message
00:01 fuzai joined #gluster
00:02 fuzai http://pastebin.com/Euuq7DKc <--- du -sh is not reporting the used space correctly.  This is with ubuntu 12.04.01 with gluster from the gluster repos
00:02 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
00:03 fuzai i mean df -h isn't reporting space correctly, i'm guessing that du is going to be more accurate than df
00:07 JoeJulian is "du -sh --apparent-size /storage" any closer?
00:08 fuzai yes
00:09 fuzai it reports the same as du -sh
00:09 fuzai df is wrong, not du
00:10 JoeJulian Is this in 3.3?
00:12 fuzai Whatever is current in the ubuntu gluster repos (gluster provided)
00:12 JoeJulian Run "gluster volume heal home full"
00:15 * JoeJulian waves at fuzai from Edmonds.
00:16 fuzai seems the first time it didn't like it but the 2nd time it worked
00:16 JoeJulian Looks like you're local. :D
00:16 fuzai Kinda
00:16 fuzai i'm down in the south end visiting family, but i live down in Portland now
00:16 fuzai I've got family all over the Kent area
00:17 JoeJulian Cool. I get down to Portland from time-to-time, too.
00:17 fuzai nice
00:17 fuzai df -h isn't reporting correctly still
00:17 JoeJulian How many total files do you have? (roughly)
00:19 fuzai not that many, i'm using this for home directories, and i've got one active user on it
00:20 fuzai 9800 files
00:21 fuzai 9856 is what find | wc -l comes back with
00:22 fuzai I just realized on my terminal servers that both of my gluster shares are reporting the exact same size / usage / available in df -h
00:25 kaos01 my glusterfs service fails to start running it with --debug gives me: /etc/glusterfs/glusterfsd.vol:No such file or directory
00:25 fuzai you need to create the file
00:25 fuzai so it knows the mapping of your bricks
00:26 fuzai actually ignore that
00:26 JoeJulian kaos01: That should be created on install.
00:26 kaos01 aah sh** my fault
00:27 kaos01 is there a way to generate those files ?
00:28 JoeJulian reinstall?
00:28 JoeJulian Are you using rpms?
00:28 kaos01 yes
00:28 JoeJulian yum reinstall glusterfs-server
00:28 kaos01 but i already configured volumes, etc ...
00:28 kaos01 does that matter ?
00:28 JoeJulian That's fine.
00:28 kaos01 cool
00:35 kaos01 now getting Directory '/home/export' doesn't exist, exiting
00:35 JoeJulian does it?
00:36 kaos01 no
00:36 kaos01 glusterfsd.vol: is looking for it
00:36 JoeJulian Don't start glusterfsd. Just start glusterd.
00:40 kaos01 cool
00:41 kaos01 is the gluster mount supposed to hang when a node goes down ?
00:41 kaos01 until, i'm guessing, a timeout is reached
00:52 JoeJulian depends on how it goes down. If it's a hard fail, yes, 42 seconds. If it's a proper shutdown, it should be able to go unnoticed.
00:52 JoeJulian Assuming a replicated volume, of course.
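The 42 seconds mentioned here is the client-side ping timeout, a per-volume option. A minimal sketch of inspecting and tuning it, with "myvol" used only as a placeholder volume name:

    # how long clients wait before declaring an unresponsive brick dead
    # (the default is 42 seconds; lowering it too far risks false failovers)
    gluster volume set myvol network.ping-timeout 42
    gluster volume info myvol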
01:12 yinyin joined #gluster
01:22 kaos01 ok something is wrong then, maybe init scripts. i do a reboot of a node and the other node hangs when accessing the replicated directory
01:38 kevein joined #gluster
01:41 mynameisdeleted joined #gluster
01:41 mynameisdeleted I have 8 machines using the system at any time.. most booting linux off the network off a 15TB raid6 nfs root drive
01:41 mynameisdeleted at least 4 of these use heavy multimedia, but generally one machine at a time reads or writes huge files
01:42 mynameisdeleted would gluster prevent a single machine reading a huge file from slowing down other machines' file access times?
01:42 mynameisdeleted my network uses infiniband and gigabit ethernet but infiniband for all desktops.... which means I have no spare card slots as they also all use high end graphics cards and only have 2 fast slots per board
01:42 mynameisdeleted so I have to use the 6 built-in sata ports on a single node
01:43 mynameisdeleted if I use 3 i7 workstations to run 18 drives at 3TB or 4TB per drive will this read more files in parallel compared to just using single raid?
01:43 mynameisdeleted from what I know the linux kernel doesn't support gluster-root
01:44 mynameisdeleted but a custom initrd could do some fancy mounting to load glusterfs and use other filesystems where needed to actually make that happen anyways
01:44 mynameisdeleted and I can still keep most systems diskless
01:44 mynameisdeleted here are questions about gluster
01:44 mynameisdeleted 1:  will it improve max file count?
01:44 mynameisdeleted 2:  will it improve parallel performance from different machines at the same time?
01:45 mynameisdeleted 3:  will it improve single huge-file read rates?
01:45 mynameisdeleted 4:  will it provide most of the features I need on a root filesystem?
01:45 mynameisdeleted 5:  will it be safer than raid6?
01:46 mynameisdeleted 6:  will this be fault tolerant if I only use 2 or 3 datanodes and one goes down?
01:46 berend joined #gluster
01:49 kaos01 i know nothing about glusterfs but would single read rate still depend on underlying storage, i.e. raid, disk speed
01:49 kaos01 as would parallel performance ?
01:52 semiosis mynameisdeleted: assuming that your workstations are not all contending for access to the same file then you probably will get better performance using glusterfs to distribute your files over many bricks/servers
01:52 semiosis mynameisdeleted: chances are that you would get most of the benefits you seek but its hard to say for sure
01:53 semiosis re: 1-2, probably yes
01:54 semiosis re: 3, single file performance will be limited to your raw disk/network abilities, but you can spread files out over more disks/servers which may help
01:55 semiosis re: 4, many people store home dirs on glusterfs, root dirs less often but it should work -- and if it doesn't, that's a bug which could be fixed :)
01:55 semiosis re: 5, generally glusterfs can be made safer than raid* -- i for one use glusterfs to get cross-datacenter redundancy, which raid can not do
01:56 semiosis re: 6, if you use replication well you can survive server failure
01:57 semiosis it's always possible to cause ,,(split brain) with replication but you can design your arch. to make this very unlikely
01:57 glusterbot I do not know about 'split brain', but I do know about these similar topics: 'split-brain'
01:57 semiosis ,,(split-brain)
01:57 glusterbot (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
01:59 semiosis mynameisdeleted: you should also check back in 16 hours when it's business hours in most of the US, this channel is more active at that time... also it's the holiday season so there's probably less people around than usual :(
02:02 yinyin joined #gluster
02:04 mynameisdeleted yeah.. so glusterfs can use raid5/raid6 nodes and run on top of that using raid6 for block devices?
02:04 mynameisdeleted 1 device per node?
02:04 mynameisdeleted or is it better to give gluster the disks directly?
02:04 mynameisdeleted lots of disks per node?
02:04 mynameisdeleted 1 3TB drive for metadata should be plenty for a 30TB system?
02:05 mynameisdeleted maybe a cheaper 1TB drive for the metadata?
02:06 mynameisdeleted whats proper ratio of metadata-space to total disk space?
02:06 mynameisdeleted ohh.. gluster is not lustrefs...haha
02:06 mynameisdeleted no metadata drive needed
02:06 mynameisdeleted what are the big reasons to choose lustre or gluster or gpfs
02:06 mynameisdeleted ?
02:08 semiosis metadata?
02:09 semiosis right, not luste
02:09 semiosis lustre
02:09 semiosis glusterfs is completely distributed -- no master
02:09 * semiosis loves saying that :)
02:09 semiosis i usually recommend giving glusterfs the disks directly, format them with xfs, inode size 512
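A minimal sketch of preparing a brick the way semiosis describes, with the device and mount point as placeholders:

    # format the brick with XFS and 512-byte inodes so gluster's extended
    # attributes fit inside the inode
    mkfs.xfs -i size=512 /dev/sdb1
    mkdir -p /export/brick1
    mount -t xfs /dev/sdb1 /export/brick1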
02:10 semiosis imho glusterfs replication >> raid mirroring
02:10 semiosis s/>>/is much better than/
02:10 semiosis s/\>\>/is much better than/
02:10 glusterbot semiosis: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
02:10 glusterbot What semiosis meant to say was: imho glusterfs replication is much better than raid mirroring
02:11 mynameisdeleted I want complete redundancy if a power-supply failure in one system fries all drives at once?
02:11 mynameisdeleted or if a fan failure overheats all drives in that tower
02:11 mynameisdeleted or if the motherboard fails and all drives are inaccessible or if a single system powers off with all drives in it
02:11 semiosis i concede that you may still want to use raid striping if you either need extra single thread performance or have files almost as large as or larger than individual disks
02:11 semiosis you can use glusterfs to replicate between servers
02:11 mynameisdeleted single file performance is my biggest concern
02:12 mynameisdeleted so first upgrade to 2 nodes is like raid1.. both servers replicate each other exactly
02:12 mynameisdeleted and provide 2 maxed out reads at once for single large files
02:12 mynameisdeleted 3rd node doubles my gluster size and keeps it safe from any single machine reboot
02:12 semiosis 3rd & 4th?
02:12 mynameisdeleted yeah
02:12 mynameisdeleted I'd like to make it like raid5
02:13 semiosis you want replication right?  so you'd want to add bricks in pairs, which usually means adding servers in pairs
02:13 mynameisdeleted any file is copied to at least 2 systems but no more
02:13 semiosis no raid 5
02:13 mynameisdeleted ok.. so raid10 -like
02:13 semiosis best to avoid raid comparisons when understanding glusterfs, it just leads to confusion
02:13 semiosis imho
02:13 semiosis i guess raid10 would be the closest analogy, but it's a weak one at best
02:13 semiosis glusterfs works at the file level, raid at the block level
02:13 semiosis so they're quite different
02:14 mynameisdeleted every file is stored on 2 servers, but they  may not be saved in pairs
02:14 mynameisdeleted so if I use 3 nodes... first file might save on 1 and 2, while the 2nd might save on 2 and 3 and 3rd on 3 and 1?
02:14 semiosis when you create a replicated volume you provide a list of bricks, you must give a number of bricks that is a multiple of your replica count
02:14 mynameisdeleted ahh
02:14 mynameisdeleted so can have 2 or 4 nodes but not 3 if replica count is 2
02:15 semiosis well you can but it just makes things more complicated
02:15 mynameisdeleted I think lustre may allow for using 3 datanodes with replica count of 2
02:15 semiosis you can do that with glusterfs too, i just prefer not to
02:15 semiosis let me find the doc
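A sketch of the brick-count rule being discussed, with placeholder hostnames and paths: for replica 2 the brick list must be a multiple of two, and consecutive pairs in the list become replica sets.

    # (server1,server2) and (server3,server4) each hold one copy of a file;
    # files are then distributed across the two pairs
    gluster volume create myvol replica 2 \
        server1:/export/brick1 server2:/export/brick1 \
        server3:/export/brick1 server4:/export/brick1
    gluster volume start myvol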
02:15 fuzai how often should I be doing volume heals?
02:15 mynameisdeleted heal resolves split-braining issues?
02:15 mynameisdeleted where one copy differs from another in a replicated system?
02:16 semiosis that's not split brain
02:16 fuzai among other things
02:16 semiosis split brain is when heal can't figure out which of two divergent replicas is the "good" one
02:16 mynameisdeleted what do you have to do if you see that on a file?
02:16 semiosis normal inconsistency is handled automatically, when one is clearly old & the other new
02:17 mynameisdeleted is there a tool to fix this?
02:17 mynameisdeleted on a per file basis where I can choose which I want
02:17 mynameisdeleted ?
02:17 fuzai so back to my first question, if i'm using gluster as a storage place for home directories for a cluster of terminal servers, how often should I fully heal?
02:17 semiosis fuzai: since 3.3.0 there is a self-heal daemon which proactively heals stuff... pre-3.3.0 files would be healed when they were next accessed
02:17 semiosis mynameisdeleted: see ,,(split-brain)
02:17 fuzai ok but someone suggested to me that i needed to heal earlier
02:17 glusterbot mynameisdeleted: (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
02:18 fuzai glusterfs 3.3.1 built on Nov  8 2012 00:26:36
02:18 fuzai :)
02:18 semiosis mynameisdeleted: split brain should not happen, and can be avoided with good planning & failure testing
02:19 semiosis mynameisdeleted: you can also use quorum (ideally with 3-way replication) to further prevent split brain
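A sketch of the quorum setting semiosis mentions, assuming the 3.3 client-quorum option and a placeholder volume name:

    # with replica 3, "auto" only allows writes while more than half of the
    # replicas are reachable, which makes split-brain much harder to cause
    gluster volume set myvol cluster.quorum-type auto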
02:19 mynameisdeleted haha... other method.. cron-daemon and unisync between nfs servers
02:20 mynameisdeleted that prob stinks compared to glusterfs
02:20 mynameisdeleted I've done that over internet before when I didnt want file access over the internet but did want 2 directories to match
02:21 semiosis there's still cases where its useful, at least until glusterfs gets real multi-master geo-replication (hopefully soon)
02:21 mynameisdeleted could mirror a folder on glusterfs to a folder on a personal laptop drive or an off-site drive
02:21 mynameisdeleted also a cron script to run unisync between machines works with the machine fully offline for the most part
02:21 mynameisdeleted and doesn't require the laptop client to replicate the entire filesystem.. only those folders it needs
02:30 raven-np joined #gluster
02:47 vex joined #gluster
02:54 vex so 'gluster volume heal <volume> info split-brain' just shows me a lot of gfid: information
02:54 vex How do I parse that into something useful, or clean it up?
03:04 zhuyb joined #gluster
03:34 zhuyb joined #gluster
04:01 masterzen joined #gluster
04:04 vpshastry joined #gluster
04:42 hagarth joined #gluster
04:43 layer3switch joined #gluster
04:51 Humble joined #gluster
04:56 yinyin joined #gluster
05:12 kevein joined #gluster
05:19 raghu joined #gluster
05:25 bulde joined #gluster
05:31 layer3switch joined #gluster
05:34 sgowda joined #gluster
05:48 shylesh joined #gluster
05:48 shylesh_ joined #gluster
05:58 bala1 joined #gluster
06:09 ramkrsna joined #gluster
06:33 yinyin joined #gluster
06:34 glusterbot New news from resolvedglusterbugs: [Bug 825562] [glusterfs-3.3.0q43]: clear-locks attempts to connect using privileged port <http://goo.gl/7JI6O>
06:51 kikupotter joined #gluster
06:52 shireesh joined #gluster
06:53 kikupotter http://pastebin.com/467eYCYq  help me !!
06:53 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
06:55 kikupotter :(
07:07 bulde kikupotter: what is the volume name?
07:07 bulde ie, the 'gluster volume create' name
07:07 bulde the issue is you are trying to mount the 'brick', not the volume, check 'showmount -e '
07:08 kikupotter ok
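The distinction bulde is drawing, sketched with placeholder names: the mount source is the volume name shown by 'gluster volume info', not the brick directory on the server's disk.

    # wrong: mounting the brick path directly
    #   mount -t nfs server1:/export/brick1 /mnt/data
    # right: mount the volume by name
    mount -t glusterfs server1:/myvol /mnt/data
    # or over gluster's built-in NFS server (NFSv3); check what is exported first
    showmount -e server1
    mount -t nfs -o vers=3 server1:/myvol /mnt/data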
07:10 vijaykumar joined #gluster
07:17 kikupotter bulde, thank you !!!
07:17 kikupotter :)
07:18 bulde kikupotter: :-)
07:19 bulde good that it worked for you now, btw, just one question, why 3.2.5 ? and not 3.3.1?
07:21 kikupotter bulde, i don't know. i just used apt-get install glusterfs on the ubuntu 12.04 platform.
07:21 rgustafs joined #gluster
07:21 bulde ah! get it
07:21 bulde *got it
07:22 ngoswami joined #gluster
07:22 jtux joined #gluster
07:22 bulde ok, enjoy your time with glusterfs. any questions, post here... even if not many are around, they would be answered when someone sees them later... also, considering this is the year end, traffic may be less in this channel for another week
07:23 rgustafs joined #gluster
07:24 kikupotter thanks !!!!!
07:36 kikupotter i want to use glusterfs to store openstack vm data, but i don't know how the architecture can improve HA and load balancing
07:55 kikupotter i need help, thanks everyone.
07:56 ctria joined #gluster
07:58 melanor9 joined #gluster
08:04 berend` joined #gluster
08:07 berend` joined #gluster
08:12 jtux joined #gluster
08:22 sgowda joined #gluster
08:31 tjikkun_work joined #gluster
08:44 dobber joined #gluster
08:51 vimal joined #gluster
08:53 mohankumar joined #gluster
08:56 face|less joined #gluster
09:04 glusterbot New news from resolvedglusterbugs: [Bug 822253] Poor disk performance <http://goo.gl/jP9JX> || [Bug 823868] Rebalance and remove-brick does not obey quota usage limits <http://goo.gl/ZIiH2>
09:09 jtux joined #gluster
09:29 mohankumar joined #gluster
09:34 zhuyb joined #gluster
09:34 vpshastry1 joined #gluster
09:50 milos_ joined #gluster
09:53 face|less joined #gluster
10:30 sgowda joined #gluster
10:34 zhuyb joined #gluster
11:04 zhuyb joined #gluster
11:05 Humble joined #gluster
11:05 hagarth joined #gluster
11:07 glusterbot New news from newglusterbugs: [Bug 890502] glusterd fails to identify peer while creating a new volume <http://goo.gl/5LWrp>
11:19 vpshastry joined #gluster
11:27 mohankumar joined #gluster
11:34 zhuyb joined #gluster
11:45 yinyin joined #gluster
11:52 vpshastry joined #gluster
11:58 melanor9 joined #gluster
12:02 nullck joined #gluster
12:05 yinyin joined #gluster
12:07 glusterbot New news from newglusterbugs: [Bug 890509] kernel compile goes into infinite loop <http://goo.gl/ax3HE>
12:07 nullck joined #gluster
12:08 nullck joined #gluster
12:13 nullck joined #gluster
12:21 yinyin joined #gluster
12:24 ctria joined #gluster
12:27 andreask joined #gluster
12:27 yinyin joined #gluster
12:31 sgowda joined #gluster
13:19 hurdman joined #gluster
13:20 hurdman hi, is there a way to see what is causing the: Another operation is in progress, please retry after some time
13:21 hurdman because i don't know why or what it is, and i have to create a new volume ^^"
13:23 melanor9 i'd grep sources for the error message
13:26 hurdman ( gluster 3.2.7 )
13:26 hurdman ~/sources/glusterfs-3.2.7# grep -Rni "Another operation is in progress" .    <= no result :/
13:27 melanor9 dammit
13:27 hurdman :'(
13:28 hurdman ./xlators/mgmt/glusterd/src/glusterd-handler.c:882:                snprintf (err_str, sizeof (err_str), "Another operation is in "
13:28 hurdman ahaha
13:31 dastar joined #gluster
13:32 hurdman i restart glusterd
13:32 hurdman it seems to unlock
13:44 hurdman :/ i can't create my volume :'(
13:46 ctria joined #gluster
13:52 chirino joined #gluster
13:54 hurdman http://pastebin.com/ZWhrE0yT <= more info here
13:54 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
13:54 hurdman http://fpaste.org/CJOM/ <= more info here
13:54 glusterbot Title: Viewing Paste #263073 (at fpaste.org)
13:58 vpshastry joined #gluster
13:59 raven-np joined #gluster
14:01 vijaykumar left #gluster
14:04 samppah @yum repo
14:04 glusterbot samppah: kkeithley's fedorapeople.org yum repository has 32- and 64-bit glusterfs 3.3 packages for RHEL/Fedora/Centos distributions: http://goo.gl/EyoCw
14:32 mohankumar joined #gluster
14:50 stat1x joined #gluster
15:01 nightwalk joined #gluster
15:08 dbruhn joined #gluster
15:16 stopbit joined #gluster
15:21 wushudoin joined #gluster
15:38 neofob joined #gluster
15:56 hagarth left #gluster
15:57 hagarth joined #gluster
16:14 guigui3 left #gluster
16:24 raven-np joined #gluster
16:31 dbruhn Anyone have any opinions on NFS vs the Gluster Fuse client when performance is the issue?
16:31 dbruhn I am running a distributed replicated volume
16:34 vpshastry joined #gluster
16:35 semiosis dbruhn: my opinion is that you should try both and see what works better for you in your setup
16:37 dbruhn It's a production environment that's really hard to take down to make changes to. So I would love to play with some trial and error, just lacking in opportunity.
16:41 bulde joined #gluster
16:48 swinchen_ joined #gluster
16:49 swinchen_ hi all.  I have a strange problem.  When I do a "peer probe" it says that "10.20.0.2 is localhost" although it clearly isn't.  Any ideas?
16:49 semiosis swinchen_: did you clone your servers?
16:50 swinchen_ semiosis: no.  I did upgrade gluster from 3.2.7 -> 3.3.1
16:51 semiosis weird
16:51 semiosis if you're upgrading then why do you need to probe?
16:51 swinchen_ the update wasn't very clean because I forgot to stop the volume.
16:51 semiosis oops
16:52 swinchen_ I detached the peer after messing up the upgrade.
16:52 swinchen_ trying to "start fresh" if you will.
16:52 ctria I don't think that peer probing  has to do with volumes
16:53 ctria hm... i had to wait for this "...dettached peer..."
16:53 swinchen_ No, but right now I am trying to create a volume and I can't probe any peers.   Peer status returns "no peers"
16:53 semiosis swinchen_: well if you really want to start fresh, then stop glusterd, move all the contents of /var/lib/glusterd out to a backup location, and restart glusterd
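A sketch of that "start fresh" procedure, with the backup location chosen here only as an example:

    service glusterd stop
    mkdir -p /root/glusterd-backup
    mv /var/lib/glusterd/* /root/glusterd-backup/
    service glusterd start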
16:53 ctria semiosis, doesn't this include hooks too?
16:54 semiosis ctria: hooks?
16:54 * ctria starts his VM...
16:54 semiosis what hooks?
16:54 swinchen_ [root@test1 ~]# gluster peer probe test2
16:54 swinchen_ Probe on localhost not needed
16:55 swinchen_ that was after stopping glusterd, deleting /var/lib/glusterd and restarting the service
16:55 swinchen_ wtf.  odd.
16:55 ctria swinchen_, pinging test2 pings localhost or the other peer?
16:55 semiosis double check your name resolution... hosts file, dns, whatever
16:55 swinchen_ other peer
16:56 semiosis swinchen_: you'll need to stop glusterd, (re)move /var/lib/glusterd, restart glusterd on the other server(s) as well
16:56 swinchen_ test1 is 10.20.0.1/16 and test2 is 10.20.0.2/16    this step worked fine in 3.2.7
16:56 swinchen_ semiosis: I did that on both servers
16:57 semiosis please pastie.org your /var/log/glusterd/etc-glusterfs-glusterd.log file
16:57 ctria semiosis, maybe the hooks are only part of RHS
16:57 semiosis from the host where you're running the probe commands
16:59 swinchen_ ok, one sec.
16:59 melanor91 joined #gluster
16:59 melanor91 hi
16:59 glusterbot melanor91: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
17:00 melanor91 i have  a configuration of a 4  nodes with multiple bricks and replica-count 2
17:00 melanor91 one of the  bricks failed as a hardware, and was replaced
17:00 melanor91 now i have a new hdd in place, how do i correctly insert in in gluster ?
17:01 melanor91 Status of volume: imgbb
17:01 melanor91 Gluster process                                         Port    Online  Pid
17:01 melanor91 ------------------------------------------------------------------------------
17:01 melanor91 Brick baby-i3:/opt/sdb1/brick                           24035   Y       18967
17:01 melanor91 Brick baby-i4:/opt/sdb1/brick                           24035   Y       30219
17:01 swinchen_ semiosis: http://pastebin.com/qTM2us7V
17:01 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
17:01 melanor91 joined #gluster
17:02 swinchen_ semiosis: http://pastebin.com/qTM2us7V  <--- not sure if you got this with melanor91 flooding the channel.
17:02 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
17:03 swinchen_ http://fpaste.org/f9Wu/
17:03 glusterbot Title: Viewing Paste #263113 (at fpaste.org)
17:06 GLHMarmo1 joined #gluster
17:06 swinchen_ formatting is better here http://dpaste.org/DUD9A/ semiosis
17:06 glusterbot Title: dpaste.de: Snippet #215686 (at dpaste.org)
17:11 GLHMarmot joined #gluster
17:15 mynameisdeleted so.. solution.. need to read a few large files fast... need to read lots of small files fast as well
17:16 mynameisdeleted need parallel read performance from different computers accessing different files on the same networked filesystem
17:16 mynameisdeleted need support for windows, mac, linux
17:16 mynameisdeleted need different access to support most bw going over infiniband and not ethernet or wireless
17:16 mynameisdeleted solution.. raid5/6 + 6 disks per node + ext4... clustered between nodes and replicated between nodes with gluster
17:17 mynameisdeleted raid5 is used to read a single gigabyte file in a second or so for photoshopping gigapixel photos etc on mac
17:17 mynameisdeleted or linux
17:17 mynameisdeleted also used to load large single video files
17:17 mynameisdeleted gluster is used so one person's read doesn't slow down another person's read and N reads can happen on N nodes without performance interference
17:18 mynameisdeleted it's also used to make the root fs fault tolerant to a single node poweroff or reboot
17:18 mynameisdeleted so that a single computer crash can't crash everyone's computer
17:18 vpshastry joined #gluster
17:18 mynameisdeleted (right now it freezes it til the nfs server reboots and then resumes as normal)
17:19 mynameisdeleted so instead of raid6+ext4+nfs  I use raid6+ext4+multipleservers+glusterfs+nfs
17:19 mynameisdeleted with nfs existing only on mac and windows clients that I don't wish to install gluster on
17:19 mynameisdeleted any thoughts?
17:20 mynameisdeleted I'd like to use dns load balancing and an nfs hostname so different windows/mac machines use different nfs servers to get the same filesystem and distribute load better as well
17:21 semiosis xfs is recommended over ext4 for glusterfs
17:21 semiosis with inode size 512
17:23 eryc joined #gluster
17:24 cicero semiosis: what's the reasoning behind using xfs over ext4, and with that inode size?
17:24 isomorphic joined #gluster
17:25 Mo___ joined #gluster
17:26 elyograg cicero: recent linux kernel changes, backported to many current distros (including rhel/centos6), have broken glusterfs on ext4.  as for the inode size, glusterfs stores extended attributes (xattr) on each file.  with the standard inode size of 256, these often will cause the file entry to exceed one inode, meaning you have an extra disk seek for the second inode every time you access that file. disk seeks are expensive.
17:27 cicero very useful to know
17:27 cicero unfortunately i just deployed an ext4-based 2-node volume :(
17:27 cicero 2-replica bricks
17:27 elyograg don't upgrade the kernel. ;)
17:28 cicero i shan't
17:28 cicero do you know what's the latest safe kernel?
17:28 elyograg let's see if I can get glusterbot to tell us about it.  ,,(ext4)
17:28 glusterbot Read about the ext4 problem at http://goo.gl/PEBQU
17:28 cicero thank you
17:29 semiosis elyograg: i couldn't have said it better myself!
17:29 cicero gosh that sounds like a nasty bug
17:30 semiosis cicero: the change came in mainline linux 3.3.0, but it's been backported to redhat (centos, etc) kernels in the 2.6 branch
17:30 cicero thanks
17:30 semiosis so basically all the latest centos kernels are affected, and ubuntu kernels starting with the quantal (12.10) release... precise is safe (for now) with its 3.2-series kernel
17:31 cicero yeah, phew
17:32 elyograg No fix is currently available in gluster, I don't know what the status of that is.  One would hope for 3.3.2, but it looks like one of the key developers has vetoed the patch as it was on Dec 12.
17:32 swinchen_ semiosis: did you happen to look at the log files I pasted?
17:33 semiosis yeah nothing helpful there
17:34 swinchen_ shoot.  I think I might downgrade.  :/  3.3 has some features I would really like though.
17:34 swinchen_ thanks for looking
17:34 nueces joined #gluster
17:36 semiosis yeah you should upgrade
17:36 semiosis but fix your name resolution problems first
17:36 semiosis are you sure you dont have the hostnames mixed up?
17:37 semiosis or wrong entries in /etc/hosts?
17:37 semiosis or maybe old version of glusterd still running?
17:37 semiosis just throwing out ideas here
17:37 swinchen_ semiosis: I am positive.  I can paste the hosts files if you want.
17:39 swinchen_ no.. no old version of gluster running.  I am going to remove all gluster packages and reinstall
17:46 swinchen_ no help.  wtf.   Maybe I will reboot the servers.
17:47 Humble joined #gluster
17:47 semiosis thats bizarre
17:48 swinchen_ I did a locate on the word gluster and removed everything too.  Should I only be starting glusterd or glusterfsd as well?
17:50 semiosis just glusterd
17:50 swinchen_ ok good.  that is what I was doing.  hrmm...
17:50 swinchen_ I am using CentOS..  I wonder if there is some oddity I am unaware of.
17:51 swinchen_ I have selinux disabled.
17:53 vpshastry left #gluster
17:55 swinchen_ nope, that didn't help either.  How could it think 10.20.0.1 is the localhost.   ugh!
17:55 swinchen_ I have tried entering the IP manually too, instead of just using the hosts file.
17:56 swinchen_ [root@test2 ~]# gluster peer probe 10.21.0.45
17:56 swinchen_ Probe on localhost not needed
17:57 swinchen_ even more strange... I don't even have a machine with that IP on the network
18:00 elyograg initial thoughts - firewall software, ssh port forwarding or ssh socks proxy gone horribly wrong ...
18:03 swinchen_ no forwarding setup...  I will stop iptables to see if that helps
18:03 dhsmith joined #gluster
18:03 elyograg centos defaults to selinux being on as well.  doesn't seem to me like that could cause problems, but i always turn it off anyway.
18:05 swinchen_ elyograg: I have it turned off as well (in the config file).   The strange thing is the version in the epel repo (3.2.7) worked fine.  No idea what changed in 3.3.1 (or the packaging thereof) that might cause this.
18:05 dbruhn did you downgrade to 3.2.7 and have it continue working?
18:05 semiosis swinchen_: pastie.org the output of ifconfig please
18:06 dhsmith joined #gluster
18:06 semiosis swinchen_: from the host with probe problems
18:06 swinchen_ both hosts have the problem.  I will pasted test1
18:06 elyograg swinchen_: output from 'route -n' wouldn't be a bad thing either.
18:08 melanor9 joined #gluster
18:10 swinchen_ here is my ip, hosts, and route -n for both hosts:  http://dpaste.org/D0owK/
18:10 glusterbot Title: dpaste.de: Snippet #215694 (at dpaste.org)
18:11 swinchen_ dbruhn: I haven't tried a downgrade yet
18:17 swinchen_ anything look out of place?
18:24 semiosis swinchen_: doubtful anything in 3.3.1 changed to cause this... or more people would be reporting similar problems
18:24 semiosis and by "more" i mean anyone else
18:24 semiosis :)
18:25 semiosis any iptables NAT going on?
18:25 semiosis or other exotic network config?
18:25 semiosis also can you double check that /var/lib/glusterd/glusterd.info has *different* contents on both of your gluster servers test1 & test2
18:33 swinchen_ it does have different uuid stuff.   Iptables is stopped.  Only thing I have that might be a little odd is net.ipv4.ip_nonlocal_bind = 1
18:50 melanor9 joined #gluster
19:00 obryan joined #gluster
19:08 atrius joined #gluster
19:18 semiosis when did ip_nonlocal_bind change?
19:20 semiosis http://gluster.org/pipermail/gluster-users/2012-July/010833.html
19:20 glusterbot <http://goo.gl/l2lST> (at gluster.org)
19:20 semiosis swinchen_: ^^^
19:21 semiosis that's a new one for me
19:21 semiosis very interesting
19:32 atrius joined #gluster
19:40 duffrecords joined #gluster
19:42 ron-slc left #gluster
19:46 vex so 'gluster volume heal <volume> info split-brain' just shows me a lot of gfid: information. How do I parse that into something useful? (Or what do I do with it?)
19:50 Cenbe When using glusterfs to store virtual machine disk images (kvm/qemu), what is the preferred format for those images? raw? qcow2?
20:02 duffrecords I'm rebuilding a GlusterFS system set up by a previous sysadmin and finding it was not set up optimally.  half of the servers have bricks that are 1.5x larger.  is it safe to use it this way or should I partition them to the same size as the smaller bricks?  I'd rather preserve the disk space but if it comes at the cost of performance that's not worth it.
20:06 semiosis it's safe though probably not optimal
20:16 duffrecords as in a performance hit?
20:21 elyograg duffrecords: the extra space on the larger bricks likely won't get used.  the volume will say it has free space, but you'll run into problems when the smaller bricks fill up.
20:26 duffrecords if you were faced with 4 software RAID 10 bricks, two of which were 2.8 TB and two were 4.1 TB, would you resize the larger ones?
20:32 dbruhn You can use the quota in the config that will make it so a brick won't fill beyond a certain level
20:33 dbruhn it's helpful for situations like this where the system has non uniform brick sizes
20:34 daMaestro joined #gluster
20:35 duffrecords does the quota apply to the entire volume or on a per brick basis?
20:37 dbruhn page 33 of the gluster 3.3.0 admin guide
20:37 dbruhn cluster.min-free-disk
20:37 dbruhn Specifies the percentage of disk space that must be kept free. Might be useful for non-uniform bricks.
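A sketch of setting that option, with the volume name and threshold as examples:

    # stop placing new files on a brick once it has less than 10% free space
    gluster volume set myvol cluster.min-free-disk 10%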
20:38 bauruine joined #gluster
20:45 duffrecords thanks.  so if I understand correctly, the general consensus here seems to be that bricks should be uniform, whether that means partitioning them to the same size or using a quota to prevent extra space in the larger bricks from being used.  am I right?
20:45 daMaestro partitioning
20:45 daMaestro quotas would likely piss off gluster (it would think it has space and try to write)
20:45 daMaestro i've always understood that brick sizes should be the same, but that might have changed in newer releases
20:49 duffrecords I want to err on the side of caution.  we've lost VMs before due to bad storage configurations and that makes the rabble get out their torches and pitchforks.  I think I'll shrink the partitions to be absolutely sure.
20:50 dbruhn Probably the best idea
21:04 semiosis +1
21:05 vex can anyone tell me how to fix split brain stuff when all I get returned is gfids?
21:07 vex e.g: http://dpaste.org/ACAPF/
21:07 glusterbot Title: dpaste.de: Snippet #215697 (at dpaste.org)
21:12 bauruine joined #gluster
21:20 semiosis ~split-brain | vex
21:20 glusterbot vex: (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
21:22 vex thanks, but the files aren't listed, just gfids (as the paste shows)
21:24 semiosis vex: why do you think these are split brain?
21:25 semiosis is that what gluster volume heal info reports?
21:26 vex sorry, helps if i paste the split-brain result: http://dpaste.org/HcegK/
21:26 glusterbot Title: dpaste.de: Snippet #215698 (at dpaste.org)
21:28 semiosis how did you get into a split brain situation?
21:28 semiosis if you dont prevent it, it's probably going to happen again
21:28 vex I have no idea.
21:28 semiosis did you read the how to cause split brain link?
21:29 semiosis are you doing either of those things, or are you accessing brick directories directly (not through glusterfs client mounts)?
21:29 vex i suspect it might have been some sort of server failure
21:29 vex we're not writing to the bricks directly
21:30 semiosis alternating server failures is one of the situations presented in that article, and a sure way to cause split brain
21:32 vex so given the situation I have is there an easy (heh) way of fixing up the split-brain?
21:34 semiosis well i suppose the *easy* way is to wipe out one brick and let glusterfs re-sync everything from the other brick
21:34 vex the bricks are about 900G each :/
21:35 semiosis ,,(meh)
21:35 glusterbot I'm not happy about it either
21:36 semiosis did you read joe's article about healing split-brain on 3.3?
21:36 semiosis glusterbot linked to it
21:36 vex yep
21:36 vex i'm using 3.3.1
21:36 semiosis ok
21:36 vex I could go through and remove the listed gfids, I guess
21:36 vex easier than a 900G resync
21:37 semiosis you can resolve gfids to filenames by looking in the .glusterfs dir on the bricks
21:37 vex 'easier' / less time
21:37 vex ah
21:37 semiosis you'll find files/dirs named by gfid in .glusterfs
21:37 semiosis dirs are symlinks to their actual dirs in the brick
21:37 semiosis files are hard links to their actual files in the brick
21:38 semiosis symlinks are easy to follow of course
21:38 semiosis files otoh you'll need to get their inode number by using ls -i, then doing a find in the brick for that inode number, using find -inum
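A minimal sketch of that lookup, assuming a brick at /export/brick1 and a placeholder GFID:

    BRICK=/export/brick1
    GFID=84ef8dd5-d000-46e5-b1c5-fd06a92535f6
    G=$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID
    if [ -h "$G" ]; then
        # directories: the .glusterfs entry is a symlink to the real directory
        readlink -f "$G"
    else
        # files: the .glusterfs entry is a hard link; match on inode number
        find "$BRICK" -path "$BRICK/.glusterfs" -prune -o \
             -inum "$(ls -i "$G" | awk '{print $1}')" -print
    fi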
21:39 vex a tool to do this would be handy :)
21:39 vex grepping out the gfid or something to make it easy to rm
21:39 vex the shell variables from that link help, though.
21:40 swinchen_ semiosis: thanks for the link.  Unfortunately I really need non_local_bind for haproxy.  :.
21:40 nueces joined #gluster
21:45 bauruine joined #gluster
21:48 dhsmith joined #gluster
21:48 semiosis swinchen_: did gluster 3.2.x work with this same ip_nonlocal_bind enabled?
21:49 semiosis sorry i doubted you earler :)
21:49 semiosis thought i'd seen it all but this was a new one
21:49 * semiosis a fool
21:51 melanor9 joined #gluster
21:53 vex hrm
21:54 vex something like GFID=$(gluster volume heal storage info|grep gfid|cut -d ":" -f2|cut -d ">" -f1);rm ${GFID:0:2}/${GFID:2:2}/${GFID}
21:55 duffrecords I'm reading this guide on tuning the kernel for GlusterFS: http://community.gluster.org/a/linux-kernel-tuning-for-glusterfs/ and I was wondering: could setting vm.swappiness to 0 be detrimental if each system only has 4 GB of RAM?
21:55 glusterbot <http://goo.gl/URHmU> (at community.gluster.org)
22:02 bauruine joined #gluster
22:13 y4m4 joined #gluster
22:15 melanor9 duff: whats your current memory footprint ?
22:16 melanor9 Gents, what are those unnamed entries in the heal-failed log ?
22:16 melanor9 Brick baby-i1:/opt/sdc1/brick
22:16 melanor9 Number of entries: 56
22:16 melanor9 at                    path on brick
22:16 melanor9 -----------------------------------
22:16 melanor9 2012-12-28 02:05:31 <gfid:84ef8dd5-d000-46e5-b1c5-fd06a92535f6>
22:16 melanor9 2012-12-28 02:05:31 <gfid:fb14dae5-9f6e-437b-9cda-65d439f38102>
22:16 melanor9 2012-12-28 02:05:31 <gfid:ac01760e-e642-4292-9b2b-a7e1b170d62d>
22:24 semiosis vex: https://gist.github.com/4392640
22:24 glusterbot 'Title: Glusterfs GFID Resolver\r\rTurns a GFID into a real path in the brick (at gist.github.com)'
22:24 semiosis cheesy but effective, like all my shell scripts
22:24 vex <3
22:25 vex I was just writing the same sort of thing
22:25 vex heh
22:25 semiosis @learn gfid resolver as https://gist.github.com/4392640
22:25 glusterbot semiosis: The operation succeeded.
22:25 semiosis @gfid resolver
22:25 glusterbot semiosis: https://gist.github.com/4392640
22:25 vex yours is a lot saner :)
22:25 semiosis hehe
22:25 semiosis ty
22:30 melanor9 ty guys
22:31 swinchen_ semiosis: yep.  3.7.2 worked fine
22:31 semiosis wow!
22:31 swinchen_ errr 3.2.7
22:33 melanor9 is it ok to have some healing errors with conflicting entries
22:34 vex semiosis: if I trust one brick over the other, would it be safe to just remove the GFID?
22:35 GLHMarmot joined #gluster
22:35 vex rather than resolve the GFID and remove the file
22:39 semiosis vex: updated with optional -q
22:40 vex :D
22:41 semiosis vex: according to JoeJulian's how-to-fix ,,(split-brain) article, you need to remove both the gfid file in .glusterfs and the real file
22:41 glusterbot vex: (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
22:41 semiosis if i am reading that right
22:41 vex ok
22:41 semiosis then glusterfs should replace it by copying from the "good" replica
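A sketch of that manual fix, along the lines of the article glusterbot links: run it on the brick holding the copy you do NOT trust (the brick path and GFID here are placeholders), working on the brick directly rather than through a client mount.

    BRICK=/export/brick1
    GFID=84ef8dd5-d000-46e5-b1c5-fd06a92535f6   # from 'heal info split-brain'
    GPATH=$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID
    # the real file and the .glusterfs entry are hard links to the same inode,
    # so deleting by inode number removes both
    INUM=$(ls -i "$GPATH" | awk '{print $1}')
    find "$BRICK" -inum "$INUM" -delete
    # then stat the file through a client mount to trigger self-heal from the
    # good replica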
22:45 kaos01 joined #gluster
22:50 duffrecords melanor9: historically, about 25% of the memory was used and the rest was cached memory
22:54 berend joined #gluster
22:55 melanor9 then i think you're good for swappiness 0
22:55 melanor9 generally if you dont have any memory deficit - you dont need swap
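A sketch of applying that, with the value shown only as an example:

    sysctl -w vm.swappiness=0                      # apply now
    echo 'vm.swappiness = 0' >> /etc/sysctl.conf   # persist across reboots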
22:56 duffrecords thank you
23:29 duffrecords regarding disk tuning, if my volumes are going to be used exclusively for large virtual machine disk images, will a higher read-ahead value help?
23:29 duffrecords I'm not sure how fragmentation works inside a disk image
23:34 semiosis google is really annoying
23:36 semiosis @lucky ben england red hat storage
23:36 glusterbot semiosis: http://goo.gl/lpGAU
23:36 semiosis duffrecords: ^^ (pdf)
23:36 semiosis maybe that will be helpful
23:46 daMaestro joined #gluster
23:53 duffrecords thanks.  maybe I will start with 4096 as the kernel tuning article I mentioned earlier suggests and see how the performance is at that point
23:56 duffrecords also, once the volume has been set up, I'm going to be restoring files from a backup of the old GlusterFS installation.  what I'm worried about is http://joejulian.name/blog/glusterfs-path-or-a-prefix-of-it-is-already-part-of-a-volume/
23:56 glusterbot <http://goo.gl/YUzrh> (at joejulian.name)
23:57 duffrecords am I going to have to do that recursively for every directory or just for the root directory of the volume?
23:58 semiosis brick root dir
23:58 semiosis and possibly parents of that
23:58 semiosis but not children within the brick
23:59 duffrecords ok, that's good
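For reference, the fix described in that linked article clears the volume-related extended attributes on the brick root (and its parents if needed), not on every child directory; a sketch with a placeholder brick path:

    BRICK=/export/brick1
    setfattr -x trusted.glusterfs.volume-id "$BRICK"
    setfattr -x trusted.gfid "$BRICK"
    rm -rf "$BRICK/.glusterfs"
    service glusterd restart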
