
IRC log for #gluster, 2013-01-21


All times shown according to UTC.

Time Nick Message
00:26 abkenney joined #gluster
00:49 raven-np joined #gluster
00:58 xmltok joined #gluster
01:04 sjoeboo_ joined #gluster
01:09 xmltok joined #gluster
01:26 kevein joined #gluster
01:27 xmltok joined #gluster
01:42 greylurk joined #gluster
02:11 sgowda joined #gluster
02:20 raven-np joined #gluster
02:43 RicardoSSP joined #gluster
02:52 sashko joined #gluster
02:57 bharata joined #gluster
04:07 hagarth joined #gluster
04:30 lala joined #gluster
04:32 lala__ joined #gluster
04:37 vpshastry joined #gluster
04:43 lala_ joined #gluster
04:44 Oneiroi joined #gluster
04:45 deepakcs joined #gluster
04:58 bulde joined #gluster
04:59 overclk joined #gluster
05:04 sripathi joined #gluster
05:15 hagarth joined #gluster
05:23 rastar joined #gluster
05:52 rastar1 joined #gluster
05:52 Humble joined #gluster
05:55 raghu joined #gluster
06:05 sripathi joined #gluster
06:10 bulde joined #gluster
06:14 sripathi1 joined #gluster
06:17 melanor9 joined #gluster
06:19 sgowda joined #gluster
06:20 vpshastry joined #gluster
06:30 spn joined #gluster
06:36 sashko joined #gluster
06:37 sripathi joined #gluster
06:46 shireesh joined #gluster
06:52 bzf130_mm joined #gluster
06:53 guigui1 joined #gluster
06:58 sgowda joined #gluster
07:01 ngoswami joined #gluster
07:04 sripathi joined #gluster
07:11 zwu joined #gluster
07:11 jtux joined #gluster
07:13 ramkrsna joined #gluster
07:13 ramkrsna joined #gluster
07:13 glusterbot New news from newglusterbugs: [Bug 879078] Impossible to overwrite split-brain file from mountpoint <http://goo.gl/eR0Ki>
07:19 shireesh joined #gluster
07:19 hagarth joined #gluster
07:25 sripathi joined #gluster
07:26 vimal joined #gluster
07:37 ekuric joined #gluster
07:43 glusterbot New news from newglusterbugs: [Bug 885281] Swift integration Gluster_DiskFIle.unlink() performs directory listing before deleting file ... scalable? <http://goo.gl/pR86P> || [Bug 885424] File operations occur as root regardless of original user on 32-bit nfs client <http://goo.gl/BiF6P> || [Bug 895528] 3.4 Alpha Tracker <http://goo.gl/hZmy9>
07:44 vpshastry1 joined #gluster
07:44 rastar joined #gluster
07:52 shireesh joined #gluster
07:52 Nevan joined #gluster
07:54 Nevan joined #gluster
07:56 hagarth joined #gluster
08:01 jtux joined #gluster
08:01 Nevan joined #gluster
08:02 bulde joined #gluster
08:04 kevein joined #gluster
08:07 ngoswami joined #gluster
08:13 ctria joined #gluster
08:16 Joda joined #gluster
08:19 andreask joined #gluster
08:25 _br_ joined #gluster
08:30 _br_ joined #gluster
08:38 sripathi joined #gluster
08:38 _br_ joined #gluster
08:38 Nevan hmm in which version does the nfs client work with replicated storage? because in 3.2.1 it's not working
08:43 _br_ joined #gluster
08:43 duerF joined #gluster
08:47 _br_ joined #gluster
08:49 Norky joined #gluster
08:54 _br_ joined #gluster
08:57 _br_ joined #gluster
09:00 tjikkun_work joined #gluster
09:01 _br_ joined #gluster
09:01 melanor9 joined #gluster
09:02 sripathi joined #gluster
09:06 _br_ joined #gluster
09:13 jh4cky joined #gluster
09:18 gbrand_ joined #gluster
09:19 DaveS joined #gluster
09:20 sripathi joined #gluster
09:21 bharata joined #gluster
09:21 Norky joined #gluster
09:28 rastar joined #gluster
09:30 dobber joined #gluster
09:34 spn joined #gluster
09:37 vpshastry joined #gluster
09:38 Azrael808 joined #gluster
09:39 raven-np joined #gluster
10:17 pai joined #gluster
10:19 guigui3 joined #gluster
10:19 srhudli joined #gluster
10:23 anmol joined #gluster
10:27 lala joined #gluster
10:27 kevein joined #gluster
10:29 ndevos Nevan: the nfs-client does not replicate, that is the job of the nfs-server
10:30 tryggvil joined #gluster
10:31 ram joined #gluster
10:32 Guest57471 exit
10:32 x4rlos Guest57471: we've all been there
10:34 Nevan if i have servers that replicate to each other
10:35 ram_ joined #gluster
10:35 Nevan the nfs client connects to one of the servers via nfs, on the replicated storage, is it then replicated or not
10:35 Nevan ?
10:36 ram_ joined #gluster
10:36 Nevan because in 3.2.1 there is no replication between the 2 servers.. when a client connects to the nfs and writes data to the replicated storage
10:40 pai_ joined #gluster
10:42 rcheleguini joined #gluster
10:42 glusterbot New news from resolvedglusterbugs: [Bug 888743] tests/basic/rpm.t fills /var/tmp <http://goo.gl/OsxBE>
10:45 pai joined #gluster
10:48 hagarth @channelstats
10:48 glusterbot hagarth: On #gluster there have been 74228 messages, containing 3303508 characters, 553504 words, 2266 smileys, and 290 frowns; 553 of those messages were ACTIONs. There have been 25410 joins, 918 parts, 24532 quits, 9 kicks, 43 mode changes, and 5 topic changes. There are currently 191 users and the channel has peaked at 193 users.
10:49 hagarth 3 short of bettering 193 :)
11:01 Nr18 joined #gluster
11:02 bala joined #gluster
11:02 sripathi1 joined #gluster
11:05 raven-np1 joined #gluster
11:18 melanor91 joined #gluster
11:19 tryggvil joined #gluster
11:25 shireesh joined #gluster
11:35 spn joined #gluster
11:37 spn joined #gluster
11:37 vimal joined #gluster
11:39 ekuric joined #gluster
11:43 pai joined #gluster
11:45 shireesh joined #gluster
11:45 bala joined #gluster
11:48 guigui1 joined #gluster
11:53 edward joined #gluster
11:56 bauruine joined #gluster
11:57 sripathi joined #gluster
11:57 pai_ joined #gluster
11:58 pai__ joined #gluster
11:59 anmol joined #gluster
12:00 pp joined #gluster
12:00 Guest61311 left #gluster
12:08 toruonu joined #gluster
12:14 toruonu argh… it seems we have a problem that's hard to solve… Namely the mounting of volumes on clients.
12:15 toruonu If we use fuse based mount, then it doesn't have any kind of lookup cache and most bricks spend 90+% of the time on lookups as python etc codes trace through their paths looking for non-existing includes/libraries.
12:15 toruonu if we use nfs based mounts, then we get caching, but are at risk of nfs lockups that seem to happen from time to time
12:16 toruonu and it's especially bad if we have to restart one of the gluster nodes for what ever reason
12:16 toruonu then all clients tied to that NFS server stay in some weird state
12:16 toruonu most times it means we end up having to stop the client nodes, unmount, remount, start them again
12:16 toruonu don't see this behavior if we use gluster fuse mount, but then it's the lookup issue that makes everything crawl
12:17 toruonu is there any way to set up caching in fuse? or through some workaround
12:19 hagarth joined #gluster
12:19 spn joined #gluster
12:24 pai joined #gluster
12:24 plarsen joined #gluster
12:25 rwheeler joined #gluster
12:27 Azrael808 joined #gluster
12:28 x4rlos As for conventions. When people create a volume on gluster, do you use gv0 or a more meaningful name such as user-shares ?
12:29 x4rlos (i know it doesn't really matter - just wondering)
12:36 toruonu I have home0 :)
12:36 toruonu for home
12:36 x4rlos hehe. Just wondering on consensus.
12:40 kkeithley the-me, semiosis: yes catch. There's a revised patch at the same location as before.
12:42 isomorphic joined #gluster
12:49 toruonu jdarcy: you around? would your negative lookup translator implementation be difficult to integrate into the 3.3.1 install from repos?
12:49 abkenney joined #gluster
12:50 isomorphic joined #gluster
12:51 Azrael808 joined #gluster
12:54 andreask joined #gluster
12:55 raven-np joined #gluster
12:56 isomorphic joined #gluster
12:58 balunasj joined #gluster
13:02 Nr18 joined #gluster
13:10 isomorphic joined #gluster
13:18 toruonu seriously guys … how do I get past the 99% time spent on lookup?
13:18 toruonu the system is effectively unusable
13:18 toruonu it seems to be some time aggregated effect as when I moved back from nfs mount to glusterfs mount then for a day or two things looked pretty neat
13:18 toruonu things worked fast etc
13:19 toruonu but now it's impossible, even a simple ls takes half a minute
13:19 isomorphic joined #gluster
13:20 toruonu hmm looking at the gluster node it's very overloaded
13:20 toruonu PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13:20 toruonu 5826 root      18   0  533m  92m 2136 S 734.3  0.2   1788:01 glusterfsd
13:21 toruonu ok, stopping profiling (which I had recently started) did reduce it
13:21 toruonu but at times it's still 600+%
13:23 x4rlos that's pretty ugly.
13:23 toruonu and those are no pushover nodes, they have dual Intel E5600 series CPU's and 48GB ram
13:26 toruonu stopping glusterfsd and glusterd left processes running
13:29 abkenney Is anyone providing commercial support for Gluster for Ubuntu? We're working on upgrading to 3.3 and interested in support.
13:30 x4rlos I'm wondering if anyone has tried to put gluster as part of the linux kernel.
13:31 johnmark exit
13:31 johnmark oops :)
13:32 toruonu how can I monitor what gluster's really doing … logs seem to be mostly silent with only occasional entries that don't really give me insights
13:32 isomorphic joined #gluster
13:33 nueces joined #gluster
13:36 dustint joined #gluster
13:45 isomorphic joined #gluster
13:45 aliguori joined #gluster
13:52 torbjorn__ x4rlos: discussion regarding in-kernel gluster comes up on the e-mail list now and again, this thread has some of the main points: http://gluster.org/pipermail/gluster-users/2012-December/035093.html
13:52 glusterbot <http://goo.gl/IoyAe> (at gluster.org)
14:01 Nr18 joined #gluster
14:02 toruonu does anyone else run gluster as /home? I'd like configuration experiences on how it works for you? Users tend to run various different apps that have different behavior and therefore it'd be nice to know what works and what doesn't
14:05 toruonu hmm… reading from http://community.gluster.org/q/what-is-the-best-way-to-use-gluster/ it seems /home is not recommended on gluster :)
14:05 glusterbot <http://goo.gl/rvNqw> (at community.gluster.org)
14:12 chirino joined #gluster
14:14 isomorphic joined #gluster
14:22 isomorphic joined #gluster
14:24 _br_ joined #gluster
14:25 pkoro joined #gluster
14:25 melanor9 joined #gluster
14:28 hateya joined #gluster
14:28 toruonu btw is  there some way I could convert a 3-way replicated volume to 2-way replicated volume and remove the 3rd replica from the configuration altogether?
14:29 kkeithley toruonu: `gluster remove brick $volname replica 2 $brickname` ought to do the trick
14:29 toruonu well if it's 12 disk 3-way replication, then I have to remove 4 bricks to make it 2-way
14:30 toruonu can I just put all the bricks in the end or ....
14:31 kkeithley yes, list all the bricks you want to remove
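A sketch of that multi-brick form, using the brick names that appear later in this log; on 3.3.x the plain form prompts about possible data loss, and as the rest of this log shows, tacking `start` on the end behaved very differently here, so rehearsing on a scratch volume first is prudent:

    # going from replica 3 to replica 2 on a 12-brick volume: name one brick
    # out of every replica triple (verify behaviour on a test volume first)
    gluster volume remove-brick home0 replica 2 \
        192.168.1.243:/d35 192.168.1.240:/d35 \
        192.168.1.243:/d36 192.168.1.240:/d36
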
14:36 _br_ joined #gluster
14:37 isomorphic joined #gluster
14:40 toruonu do I remember right, that the brick listing in gluster volume info shows in sequence the replica groups. So if I have 12 bricks with 3-way replication, then the first 3 bricks are the first replica group, second 3 are the second group etc
14:42 _br_- joined #gluster
14:44 ndevos toruonu: you could check https://raw.github.com/nixpanic/lsgvt/master/lsgvt and list your volume-topology in a tree format
14:44 glusterbot <http://goo.gl/7QGX8> (at raw.github.com)
14:46 toruonu hehe, the python on SL5.7 is too old (2.4) to work with it I think :)
14:47 toruonu syntax errors
14:47 melanor9 joined #gluster
14:48 _br_ joined #gluster
14:48 gbrand_ joined #gluster
14:50 Nr18 joined #gluster
14:54 _br_ joined #gluster
14:56 _br_ joined #gluster
14:58 chirino joined #gluster
14:58 pithagorians joined #gluster
14:59 VSpike If I have a replicated volume in 3.3, and I go to a server hosting a brick and delete a file, what will happen?
15:01 jiffe98 an angel will lose its wings?
15:01 rwheeler joined #gluster
15:01 pithagorians hello all. i have a cluster of 2 servers in replication. OS - Debian 6, 64 bit. And 4 clients reading / writing files on cluster. on one of the clients, sometimes, randomly  the partition is not accessible and when i try to ls or cd i get Input output error. any clue where to search the issue ?
15:04 _br_ joined #gluster
15:04 VSpike jiffe98: I was wondering more how many kittens would die
15:08 _br_ joined #gluster
15:08 jh4cky joined #gluster
15:09 VSpike I really have a whole bunch of questions that I can't find answers to. I'm primarily interested in replicated volumes. I'm wondering if Gluster stores metadata on the bricks (and if so where), how it deals with files being added/removed/modified on a single brick...
15:10 VSpike if you can safely stop a peer and then put it back later, and if you can make backups safely on the server by backing up only one brick.
15:10 wushudoin joined #gluster
15:13 pithagorians does gluster have performance issues ?
15:13 pithagorians on relative big amount of data?
15:14 _br_ joined #gluster
15:20 _br_ joined #gluster
15:24 _br_ joined #gluster
15:24 m0zes VSpike: the metadata is stored on the bricks. in ,,(extended attributes)
15:24 glusterbot VSpike: (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://goo.gl/Bf9Er
15:27 m0zes VSpike: if you delete a file directly on a brick, the self-heal daemon should find it and re-replicate it back from the other replica. if you don't want to wait for the self-heal daemon to find it you can stat the file from a client mount.
15:28 _br_ joined #gluster
15:28 m0zes it isn't recommended to mess with the bricks directly *unless* your file has ,,(split-brain) ed
15:28 glusterbot (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
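For context, the xattr check glusterbot links to boils down to something like this, run directly on a server hosting a brick; the brick and mount paths below are placeholders:

    # inspect the replication metadata gluster keeps on the brick copy of a file
    getfattr -m . -d -e hex /d35/path/to/file
    # expect a trusted.gfid key plus trusted.afr.<volume>-client-N changelog keys;
    # non-zero afr values mean a heal is still pending for that replica
    # touching the file through a client mount nudges the self-heal along:
    stat /mnt/home0/path/to/file
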
15:35 chirino joined #gluster
15:35 _br_ joined #gluster
15:39 _br_ joined #gluster
15:40 bennyturns joined #gluster
15:41 msgq joined #gluster
15:44 _br_ joined #gluster
15:48 _br_ joined #gluster
15:57 _br_ joined #gluster
15:59 _br_ joined #gluster
16:01 toruonu hmmm…. I'm changing replica level from 3 to 2 and the process is running. It's claiming some 461 files already that it needed to rebalance though I can't quite fathom why considering that I'm removing a single brick from every replica group
16:03 toruonu basically I have 6 nodes, each contributing 2 bricks. The setup is such that both bricks belong to different replica groups therefore I'm removing 2 nodes out of 6 (total 4 bricks)… One has scanned 6101 files and found 0 to rebalance, the other scanned 6236 and needed 601 to rebalance
16:03 toruonu was it really that out of sync?
16:03 toruonu and is that healthy really ...
16:03 toruonu or is it doing something weird
16:04 VSpike m0zes: that's excellent stuff.. thanks!
16:04 toruonu ugh
16:04 toruonu it seems a lot of files are duplicates now!!!
16:04 toruonu stopping this shit… ugh
16:05 toruonu it seems gluster reconfigured the replica set without removing the replicas first so it's a mess now
16:05 _br_ joined #gluster
16:06 VSpike With doing backups on the server on replicated volumes, I suppose it would depend on whether "replica count" == "number of bricks"
16:06 toruonu that's the brick config I had before:
16:06 toruonu http://fpaste.org/txxs/
16:06 glusterbot Title: Viewing [root@se1 ~]# python26 lsgvt Topolog ... .245:/d35 ... Brick 2: 192.168.1.240:/d36 (at fpaste.org)
16:06 toruonu so I took the last one from each replica set basically removing nodes with end IP's of 240 and 243
16:07 toruonu and that's what it's claiming now
16:07 toruonu http://fpaste.org/EpKG/
16:07 glusterbot Title: Viewing Topology for volume home0: Distribut ... 240:/d35 ... Replica set 3 ... ike Unicode, try the --ascii option. (at fpaste.org)
16:07 toruonu ran this:
16:07 toruonu [root@se1 ~]# gluster volume remove-brick home0 replica 2 192.168.1.243:/d35 192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36  start
16:07 toruonu Remove Brick start successful
16:09 VSpike Bit of a risk to build a backup system that could easily be broken by someone just adding another peer
16:10 _br_ joined #gluster
16:11 toruonu anyone?
16:11 toruonu ideas what happened?
16:12 toruonu almost all of my files are duplicates, but doing ls will claim for half that they don't exist
16:12 toruonu gluster has decided for sure that it's now 6x2 replicated volume, not 4x3
16:12 toruonu this is bad
16:13 toruonu should I stop the whole gluster to prevent further data screwup?
16:13 toruonu and how do I recover from this
16:15 stigchri_ joined #gluster
16:16 * toruonu waves hands
16:16 toruonu wouldn't be pestering as much if I didn't think that time is of the essence here
16:17 _br_ joined #gluster
16:17 sjoeboo_ joined #gluster
16:20 toruonu it seems it was a bad idea to add start to the end; I assumed it would be graceful, that it would check that the bricks were nicely in sync and then remove them
16:20 toruonu NO, it just reconfiged the bricks and then started balancing and now I have a huge mess here
16:21 _br_- joined #gluster
16:24 chirino joined #gluster
16:27 NuxRo kkeithley: it happened again, yum update brought in new glusterfs, knocking off one of my storage servers, http://fpaste.org/QVSQ/
16:27 glusterbot Title: Viewing Paste #269023 (at fpaste.org)
16:27 NuxRo that's last 500 lines in glusterfshd.log
16:27 NuxRo /var/log/glusterfs/glustershd.log even
16:28 _br_ joined #gluster
16:28 toruonu ok, quick check… do you guys even see my posts?
16:29 NuxRo toruonu: yes
16:29 toruonu ok, so noone has any idea how to at least stop things from getting worse?
16:29 NuxRo i suggest taking this on the mailing list
16:29 toruonu we're effectively at whole-site failure mode right now
16:30 toruonu I did
16:30 toruonu but I think this might need fast action … the longer the volume is online the more "damage" could happen
16:30 toruonu as people try to make heads or tails of what they see
16:31 toruonu this basically looks like a huge split-brain without it recognizing it's split-brain
16:31 _br_ joined #gluster
16:32 NuxRo no idea, mate, i can hardly solve my own problems :)
16:33 toruonu the only idea I have is that what if I only leave online one replica of everything
16:33 toruonu the first in each replica set
16:33 toruonu the original one that is
16:33 kkeithley NuxRo: you did a yum update on the live (running) system?
16:35 _br_ joined #gluster
16:36 luis_alen joined #gluster
16:38 kkeithley toruonu: That seems reasonable, it's what I was going to suggest trying.
16:38 luis_alen hello, guys. I'm new to gluster and I'm trying to mount a test volume I just created and started. The client says "mount: unknown filesystem type 'glusterfs'". I've already installed fuse, fuse-libs and glusterfs on the client. Any tips? I tried -t gfs as well...
16:38 toruonu kkeithley: well it didn't keep the volume usable
16:38 NuxRo kkeithley: yes, the system is running, this is one of the machines that hosts bricks
16:38 toruonu I think at least one of the new replica blocks (or two) were left with 0 replicas and it didn't trust the config
16:38 toruonu I've since stopped the volume to think things over
16:39 toruonu I know for sure which nodes contain a good set of data
16:39 toruonu the only thing I can think of is rsync it all into one node one folder and share that out with NFS temporarily, then later recreate the volume and bring it back. But I'd prefer a gluster fix way if possible
16:40 toruonu I think the only way out would be to manually change gluster config while gluster is down everywhere
16:41 toruonu anyone got help on how to manually change this? I'm guessing /var/lib/glusterd/vols/home0/info is the one to change, but ...
16:42 toruonu but this looks like a major bug :/
16:43 _br_ joined #gluster
16:45 kkeithley toruonu: did you make copies of your vol files before you made the change? Could you restore and go back to 4x3 for a while?
16:45 toruonu I did not
16:45 toruonu had I known that kind of crap could happen I'd have done it :/
16:45 toruonu though I wonder … if I change manually the replicas 3
16:46 toruonu the file has:
16:46 toruonu replica_count=2
16:46 toruonu if I just set it to 3
16:46 toruonu on all nodes
16:46 toruonu would it go back to 4x3
16:47 kkeithley what about the bricks you removed?
16:48 VSpike kkeithley: should I disable automatic security updates on my gluster servers? :)
16:48 toruonu they are still there, I aborted the process
16:48 toruonu i.e. stop command
16:48 toruonu they still contain all the stuff
16:48 toruonu but I'm afraid it's not that easy
16:48 toruonu subvolumes home0-replicate-0 home0-replicate-1 home0-replicate-2 home0-replicate-3 home0-replicate-4 home0-replicate-5
16:49 toruonu so it seems to have done changes in more than 1 place
16:49 _br_ joined #gluster
16:50 toruonu wonder if there is a command that would recreate the gluster volume config using the original command
16:50 kkeithley VSpike: ?
16:51 toruonu I still have it in command history luckily
16:52 VSpike kkeithley: kkeithley | NuxRo: you did a yum update on the live (running) system?
16:52 VSpike kkeithley: just suddenly made me think.. oops... perhaps automatic updates not a good idea here
16:52 _br_ joined #gluster
16:53 kkeithley VSpike: hmm. Well, there does seem to be some brittleness. It might not be a bad idea.
16:54 toruonu soo…. any ideas how I could get the correct volume config from the original create command? can I create a new volume using this command without starting it, would that bring the config?
16:55 kkeithley The gluster updates in my repo, since 3.3.1-4 anyway, have only been to UFO. The core bits haven't changed. (Although YUM/rpm don't know that.)
16:57 kkeithley create volume definitely won't start it, but if you try to use the same bricks it's going to fall down due to that.
16:57 _br_ joined #gluster
16:58 kkeithley Did you create your bricks as subdirs or are they on the root of the brick's fs?
16:58 sashko joined #gluster
17:00 kkeithley If you used subdirs, you could rerun the command with a slightly different brick subdir paths and then compare the vol files.
17:03 _br_ joined #gluster
17:08 _br_ joined #gluster
17:09 Nicolas_Leonidas joined #gluster
17:09 Nicolas_Leonidas hi
17:09 glusterbot Nicolas_Leonidas: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an
17:09 glusterbot answer.
17:10 Nicolas_Leonidas hi , I'm trying to follow these instructions to install gluster on centos
17:11 Nicolas_Leonidas http://www.gluster.org/community/documentation/index.php/QuickStart
17:11 glusterbot <http://goo.gl/OEzZn> (at www.gluster.org)
17:11 sjoeboo_ joined #gluster
17:11 Nicolas_Leonidas but when I do sudo yum install glusterfs{-fuse,-server}
17:11 Nicolas_Leonidas I get two errors, http://download.gluster.org/pub/gluster/glusterfs/3.3/3.3.1/EPEL.repo/epel-latest/x86_64/repodata/repomd.xml: [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not Found"
17:11 glusterbot <http://goo.gl/NVQyN> (at download.gluster.org)
17:12 duerF joined #gluster
17:12 toruonu kkeithley: sadly ran on the root of the path, but I could use different mountpoint names (d35, d36 used so far, could use d33,d35 and then do a sed change throughout)
17:12 toruonu but I think it'll mean uuid etc changes as well that might screw things up majorly
17:14 _br_ joined #gluster
17:20 _br_ joined #gluster
17:20 toruonu btw, what holiday is it today in US?
17:21 toruonu ah, marthin luther king jr… googled it ;P
17:22 kkeithley MLKjr semi-holiday. Only banks, schools, government, and stock market closed..
17:22 elyograg no holiday for me.
17:25 Nicolas_Leonidas can anyone comment on the error message?
17:26 _br_ joined #gluster
17:29 _br_ joined #gluster
17:29 xmltok joined #gluster
17:30 toruonu Nicolas_Leonidas: maybe the latest link isn't there. Did you check that the file/dir exists using your browser?
17:31 Nicolas_Leonidas toruonu: yeah, it doesn't exist so 404
17:31 Nicolas_Leonidas I'm using Amazon AMI instances it's not exactly centos or redhat
17:31 kkeithley Nicolas_Leonidas: I'm not sure what epel-latest is or where it's coming from in your setup. If you're running CentOS6 it ought to be trying to get .../epel-6/x86_64/...
17:31 toruonu I downloaded the yum repo file and then manually changed
17:31 toruonu I use Scientific Linux so that didn't map either I think
17:32 toruonu changed it manually to 5 in my case
17:33 Nicolas_Leonidas Amazon Linux AMI
17:33 kkeithley Yes. The repo file has .../epel-$releasever/$basearch/... which ought to be filled in correctly by YUM.
17:34 Nicolas_Leonidas I can't seem to find instructions for Amazon Linux AMI, it's only redhat and fedora
17:35 kkeithley Okay, so what does /etc/redhat-release or /etc/*-release say it is?
17:35 _br_ joined #gluster
17:36 Nicolas_Leonidas kkeithley: Amazon Linux AMI release 2012.09
17:36 kkeithley brillian
17:36 kkeithley brilliant
17:36 Nicolas_Leonidas so? should I download the rpm file manually and install?
17:37 kkeithley If you believe it's RHEL6ish, then edit the repo file(s) and change epel-$releasever -> epel-6. Or download the RPMs and manually install them.
17:37 xmltok joined #gluster
17:38 Nicolas_Leonidas kkeithley: why not epel-7?
17:40 kkeithley googling tells me that Amazon Linux 2012.09 is somewhere between rhel5 and rhel6 beta. WRT to why not epel-7, at a minimum because rhel7 doesn't exist, those are there for anyone who might be using the rhel7 alphas and betas, which you don't have.
17:40 Nicolas_Leonidas right
17:40 Nicolas_Leonidas let me try that
17:41 _br_ joined #gluster
17:41 Nicolas_Leonidas yup got installed, many thanks brb
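For anyone else on Amazon Linux AMI hitting that 404, the edit kkeithley suggests amounts to pinning the release in the repo file pulled from download.gluster.org; the file name below is an assumption, and epel-6 is the value that worked here:

    # replace the unresolvable $releasever with the EL6 directory name
    sed -i 's/epel-\$releasever/epel-6/g' /etc/yum.repos.d/glusterfs-epel.repo
    yum clean metadata
    yum install glusterfs-fuse glusterfs-server
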
17:43 Nr18 joined #gluster
17:43 manik joined #gluster
17:44 Nr18 joined #gluster
17:47 _br_ joined #gluster
17:49 al joined #gluster
17:51 _br_ joined #gluster
17:52 raghu joined #gluster
17:56 kkeithley Nicolas_Leonidas: so which did you do, d/l the rpms and manually install, or edit the repo file to epel-5 or epel-6?
17:57 _br_ joined #gluster
17:59 chirino joined #gluster
17:59 VeggieMeat joined #gluster
17:59 arusso joined #gluster
18:02 _br_ joined #gluster
18:03 melanor9 joined #gluster
18:04 hateya joined #gluster
18:04 Norky joined #gluster
18:06 toruonu so … still no clue how to recover from my duplicated stuff?
18:06 toruonu I'm right now creating a new disk that I can use as temporary recluse to copy the files from the 4 replica subs together to reform the /home
18:06 toruonu but that's an option I'd rather have not used
18:08 kkeithley oh, I thought you were going to rerun the command with minor changes and use that as a guide to knitting your original vol files back into shape.
18:08 _br_ joined #gluster
18:09 kkeithley s/the command/the original command/
18:09 glusterbot What kkeithley meant to say was: If you used subdirs, you could rerun the original command with a slightly different brick subdir paths and then compare the vol files.
18:09 luis_alen left #gluster
18:11 VSpike I'm glad I didn't read much beyond the Gluster site itself before deploying it. Most google searches seem full of dire tales of terrible woe.
18:11 kkeithley no glusterbot, bad glusterbot.
18:12 kkeithley oh, I thought you were going to rerun the original command with minor changes and use the vol files from at as a guide to reconstructing your original vol files.
18:12 kkeithley s/from at/from that/
18:12 glusterbot What kkeithley meant to say was: oh, I thought you were going to rerun the original command with minor changes and use the vol files from that as a guide to reconstructing your original vol files.
18:14 glusterbot New news from resolvedglusterbugs: [Bug 876214] Gluster "healed" but client gets i/o error on file. <http://goo.gl/eFkPQ>
18:14 toruonu VSpike: are you sure you are glad? :) I'm currently in a mixed up fucked state that I'd rather not give a good recommendation if someone asked about gluster right now
18:22 jiffe98 is there a way to clear all locks?
18:24 kkeithley toruonu: you have six nodes, each with two backing volumes to make twelve bricks? What was the original command you used to make the volume?
18:25 sjoeboo_ joined #gluster
18:26 toruonu gluster volume create home0 replica 3  192.168.1.241:/d35 192.168.1.242:/d35 192.168.1.243:/d35 192.168.1.244:/d35 192.168.1.245:/d35 192.168.1.240:/d35 192.168.1.241:/d36 192.168.1.242:/d36 192.168.1.243:/d36 192.168.1.244:/d36 192.168.1.245:/d36 192.168.1.240:/d36
18:26 Norky joined #gluster
18:27 VSpike toruonu: Fair point, but in my case it's backing Wordpress with 2 replication nodes so as long as I make frequent backups I should be OK as I'm probably not stretching it. Also I get the impression that 3.3 is a big advance in reliability
18:27 toruonu VSpike: running 3.3.1
18:27 xmltok joined #gluster
18:28 VSpike Not saying it's bug free now, obviously - just that a lot of the online horror stories are probably out of date by now.
18:29 VSpike toruonu: I sympathize though. Horrible when its *your* crucial data that gets messed up.
18:30 toruonu creating a raid0 + drbd replica system right now… testing it first and then copying the files over… hope I'll have a /home for users tomorrow :/
18:32 tryggvil joined #gluster
18:33 andreask joined #gluster
18:34 chirino joined #gluster
18:52 xmltok joined #gluster
18:56 Nicolas_Leonidas kkeithley: edited the repo file
19:01 DaveS_ joined #gluster
19:01 maek joined #gluster
19:01 maek I have 3 boxes that need to share 1 dir. Is gluster right for this task?
19:02 kkeithley Nicolas_Leonidas: to rhel5 or rhel6
19:02 Nicolas_Leonidas kkeithley: 6
19:02 kkeithley thanks
19:05 DaveS joined #gluster
19:14 stevenlokie joined #gluster
19:15 stevenlokie Has anyone had issues with Gluster 3.2.6 and move brick going into unknown status?
19:16 koodough joined #gluster
19:17 DrVonNostren joined #gluster
19:19 DrVonNostren Hi folks, I am reading the documentation on iptables for gluster which says "Note: You need one open port, starting at 38465 and incrementing sequentially for each Gluster storage server, and one port, starting at 24009 for each bricks." Just want to verify you only need to open up the number of ports for the # of bricks on that particular system, and not # of bricks in total in the volume. Please confirm / deny. Thank you!
19:19 semiosis ~ports | DrVonNostren
19:19 glusterbot DrVonNostren: glusterd's management port is 24007/tcp and 24008/tcp if you use rdma. Bricks (glusterfsd) use 24009 & up. (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111.
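Expressed as firewall rules, glusterbot's port list works out to roughly the following; this is a sketch, not a tuned policy: the brick range is sized generously, 24008 only matters for rdma, and the NFS/rpcbind lines can be dropped when only the native client is used:

    # glusterd management
    iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT
    # one port per brick ever created on this server, counting up from 24009
    iptables -A INPUT -p tcp --dport 24009:24024 -j ACCEPT
    # gluster NFS/NLM plus rpcbind, only needed for NFS mounts
    iptables -A INPUT -p tcp --dport 38465:38468 -j ACCEPT
    iptables -A INPUT -p tcp --dport 111 -j ACCEPT
    iptables -A INPUT -p udp --dport 111 -j ACCEPT
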
19:23 Nicolas_Leonidas so gluster service is running on both servers, and peer probed, how do I share a folder with redancancy, and shared data between the two? do I use gluster volume create?
19:25 DrVonNostren Appreciated semiosis, but thats not much clearer. I am not using rdma. So on each system in  dist-rep volume w/ four servers and two bricks per server i should open up 24007-24011/tcp and 38465-34869/tcp as well as 111/tcp+udp?
19:31 RicardoSSP joined #gluster
19:31 RicardoSSP joined #gluster
19:32 toruonu joined #gluster
19:33 toruonu was away while driving home… any new gluster devels/admins around who could help me with a seriously hosed config that when attempting to remove bricks to reduce replica count from 3 to 2 instead reconfigured gluster to 2 replicas, but kept the bricks changing all replication groups and now I have multiple copies of all files and a total mess
19:33 kkeithley toruonu: I just sent you an email
19:34 toruonu thanks, saw it now
19:34 kkeithley Nicolas_Leonidas: yes, `gluster volume create $volname $node1name:/path/to/brick $node2name:/path/to/brick`
19:34 kkeithley Then "export" it with `gluster volume start $volname`
19:35 toruonu files of which 55k are listed >1 :/
19:35 toruonu argh half of message got eradicated
19:36 toruonu what worries me most is that I did a listing of all files on the four bricks that should make up unique copies of the four replica groups
19:36 toruonu out of 650k files ca 55k are listed >1
19:36 kkeithley DrVonNostren: You only need 38465-38468 open if you're using NFS.
19:39 kkeithley If you're not using rdma then you can omit opening 24008 per node; you can probably get away with only opening 24007, 24009, and 24010 on each node.
19:39 maek left #gluster
19:40 DrVonNostren so if not using nfs, or rdma, in order for everything to go smoothly, i just need 24007,24009, and 24010 on my servers with bricks?
19:40 Nicolas_Leonidas kkeithley: so no need to create "briks" first/
19:40 Nicolas_Leonidas ?
19:42 kkeithley "Bricks" are just file systems.
19:42 Nicolas_Leonidas kkeithley: when I try to create volumes it says "$nodename not connected"
19:44 kkeithley But you probably want to create some just for the bricks, e.g. /dev/sdg, e.g. with xfs file system mounted someplace like /bricks/sdg.
19:44 kkeithley Then your volume create command would be `gluster volume create $volname $host1name:/bricks/sdg`
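Putting kkeithley's brick-as-a-dedicated-filesystem advice together, the setup might look like the sketch below; the device, hostnames, and volume name are placeholders, and replica 2 matches the two-node layout being discussed:

    # on each server: format the spare disk and mount it as a brick
    mkfs.xfs -i size=512 /dev/sdg     # larger inodes leave room for gluster's xattrs
    mkdir -p /bricks/sdg
    mount /dev/sdg /bricks/sdg        # add an /etc/fstab entry so it survives reboots

    # on one server, after peer probing the other:
    gluster volume create testvol replica 2 \
        node1.example.com:/bricks/sdg node2.example.com:/bricks/sdg
    gluster volume start testvol
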
19:44 Nicolas_Leonidas does it have to be xfs? can't be ext4?
19:44 kkeithley ~ext4
19:44 kkeithley @ext4
19:44 glusterbot kkeithley: Read about the ext4 problem at http://goo.gl/PEBQU
19:45 kkeithley @ext4 | Nicolas_Leonidas
19:47 Nicolas_Leonidas kkeithley: I'll create xfs volumes then
19:47 kkeithley Nicolas_Leonidas: What's the real nodename? Substitute that where I used $nodename.
19:48 kkeithley Because I don't know what the names of your hosts are
19:48 Nicolas_Leonidas kkeithley: I did, it's something like www1.mydomain.com
19:48 kkeithley but you already told us that you'd peer probed the nodes.
19:49 kkeithley what does `gluster peer info` or `gluster peer status` show?
19:50 Nicolas_Leonidas yeah I was trying to replicate a folder, which is a mounted ext4 fs, now I'm trying to get rid of that folder and turn it into xfs
19:50 Nicolas_Leonidas it says, Number of peers 1, State: Accepted peer request (Disconnected)
19:51 kkeithley ??? what does `ping www1.mydomain.com` show?
19:51 kkeithley Do you have hostnames in /etc/hosts or DNS correctly configured?
19:52 Nicolas_Leonidas it shows the IP of that server
19:52 Nicolas_Leonidas DNS configured correctly
19:52 kkeithley Firewall disabled?
19:52 kkeithley firewall a.k.a. iptables
19:52 Nicolas_Leonidas kkeithley: yeah, allowed all TCPs
19:53 toruonu Nicolas_Leonidas: with XFS beware of data loss in case of power loss
19:53 toruonu http://toruonu.blogspot.com/2012/12/xfs-vs-ext4.html
19:53 glusterbot <http://goo.gl/5d4Jg> (at toruonu.blogspot.com)
19:53 kkeithley And when you did the `gluster peer probe www1.mydomain.com` what was the output
19:54 Nicolas_Leonidas toruonu: so what do you recommend? going back to ext4 and making gluster work with it? I'm trying to create a replicated pool so there is some redundancy
19:55 kkeithley you can use ext4, just be aware of the problem with certain kernels.
19:55 toruonu well if you have UPS and are sure it'll keep the systems from losing power abstractly, then use XFS, it's fast and good otherwise, but after losing a few hundred TB I'm too sensitive to it :)
19:55 kkeithley AFAIK, the kernel/ext4 issue is only in certain RHEL6/CentOS6 kernels
19:55 Nicolas_Leonidas toruonu: this is all happening on amazon aws
19:55 toruonu as long as your kernel is below 2.6.32 you should be ok I think, read about the joe julian page
19:56 toruonu ah, well … with amazon can't say, could be they have enough redundancy that you won't be bitten, but ...
20:02 VSpike Still curious if anyone has tried gluster on btrfs... but that's probably just asking for trouble
20:03 kkeithley the ext4 bug just makes glusterfs/glusterfsd go into infinite loop when you do a directory listing on a client, making it look like the server has locked up, but you're not going to lose any data.
20:03 toruonu has anyone tried running multiple parallel rsync-s from different sources against the same destination. How fucked up will it be :)
20:04 elyograg VSpike: I'll be giving it a try eventually on a testbed.  I had thought about trying it in production, but I've decided to play it safe.
20:04 VSpike I like the idea of being able to snapshot for backups, although LVM could give the same
20:04 kkeithley VSpike: I've used btrfs for bricks. If you're being conservative though you won't use that in production. AFAIK there's still no fsck for btrfs.
20:06 VSpike kkeithley: i think there *sort of* is, but it's hard to tell
20:06 elyograg VSpike: with LVM, you have to leave part of your disk unallocated to get snapshots. that's not a *problem* exactly, but if I don't give the pointy-haired bosses every byte they paid for, they won't be happy.
20:06 VSpike elyograg: true.
20:06 VSpike I saw an article about using gluster with ZFS - I suppose that's another alternative. Is it any better tested though?
20:07 VSpike I've no real experience with ZFS, other than having just started to use FreeNAS on a home-brew NAS
20:14 kkeithley semiosis, the-me: did you get a chance to try out the (revised) patch?
20:18 chirino joined #gluster
20:19 the-me kkeithley: I already sent it to our -release team for a review
20:19 the-me .. just to prevent that it will be rejected again :) thanks for your work!
20:21 kkeithley yw
20:22 the-me ?
20:22 kkeithley you're welcome
20:22 the-me ah :)
20:23 balunasj joined #gluster
20:31 monkey joined #gluster
20:41 semiosis kkeithley: i haven't yet
20:44 kkeithley semiosis: no prob
20:52 chirino joined #gluster
20:57 luis_alen joined #gluster
21:05 chirino joined #gluster
21:09 luis_alen Hello, guys. I have a question about replicated volumes: When a peer crashes and then comes back online, will its brick be available to the volume only after it's been healed by the self healing daemon?
21:13 semiosis healing is a file-by-file thing
21:13 luis_alen yes, I saw it when I ran volume heal <vol> info
21:13 semiosis clients should reconnect to the bricks right away, and when a file is accessed that's not in sync the client should be routed to the "good" replica of that file while healing takes place
21:14 semiosis that's roughly how it goes
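The heal status luis_alen mentions can be watched with the 3.3 heal commands from any server in the pool; a quick sketch with a placeholder volume name:

    gluster volume heal testvol info              # entries with heals pending
    gluster volume heal testvol info heal-failed  # recent heal failures
    gluster volume heal testvol info split-brain  # entries needing manual fixing
    gluster volume heal testvol                   # heal the pending entries now
    gluster volume heal testvol full              # crawl and heal the whole volume
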
21:14 luis_alen cool
21:14 chirino joined #gluster
21:15 luis_alen so while the file is out of sync, the "good" replica will deliver it
21:16 luis_alen Also, I have another one: right after a crash, the client takes a lot of time (30 seconds or more) in order to be able to read the volume contents again. Is it normal? Is it tunable?
21:16 semiosis yes that's the ,,(ping timeout)
21:16 glusterbot I do not know about 'ping timeout', but I do know about these similar topics: 'ping-timeout'
21:17 semiosis ,,(ping-timeout)
21:17 glusterbot The reason for the long (42 second) ping-timeout is because re-establishing fd's and locks can be a very expensive operation. Allowing a longer time to reestablish connections is logical, unless you have servers that frequently die.
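That 42-second window is the network.ping-timeout volume option; it can be shortened if servers really do restart often, with the trade-off glusterbot describes. The volume name and value below are only examples:

    # check whether the volume overrides the 42s default (unset means default)
    gluster volume info testvol | grep ping-timeout
    # lower the time clients wait before writing a brick off as dead
    gluster volume set testvol network.ping-timeout 15
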
21:20 toruonu semiosis: maybe you have ideas how to fix my screwed up gluster config back to health? I sent an e-mail to the gluster-users list today my day time as well subject Big problem
21:20 * semiosis checks ML
21:25 nightwalk joined #gluster
21:25 semiosis lsgvt?  never heard of that
21:25 toruonu right now gluster is shut down. I know for sure that bricks have weirdness on them because I'm collecting the first brick of all original replica groups and merging them to a raid volume on another server to get a working copy of the full thing
21:25 jjnash joined #gluster
21:25 toruonu lsgvt just visualizes the gluster volume info output
21:25 toruonu :)
21:26 toruonu right now out of ca 641k files I have 51k files that overlap on bricks which shouldn't have overlaps
21:26 toruonu in almost all cases one of the files is 0 size and the other is ok
21:26 toruonu in some cases both copies are identical
21:27 toruonu but I'm afraid to start gluster again until the wrong replica config is not fixed
21:27 semiosis the 0-len usually means a heal is pending but hasnt happened yet
21:27 toruonu right now it assumes 6x2 config while it should be 4x3
21:27 semiosis glusterfs heals the directory which causes entries to be created (zero length) then comes back later and heals the files
21:27 toruonu the remove-brick replica=2 command changed config and then started to decommission wrongly
21:28 toruonu I guess in retrospect I shouldn't have used the start command
21:28 toruonu just went with the warning of losing data possibly and it'd have reconfigured and dropped bricks
21:28 semiosis in retrospect lesson is to try it on a test volume before doing it on the real thing
21:28 toruonu that too :)
21:28 toruonu but right now I'd really prefer to get it back up, healthy and then contemplate next moves
21:30 toruonu I myself see a few options: 1) change the whole thing back to 4x3 by hacking config files (would prefer tools) 2) merge the original data to a separate drive and reinit from there (doing right now in background, ca 60GB/450GB done), 3) hack config to get to the right 4x2 config with some replica bricks that I planned to remove already removed
21:31 daMaestro joined #gluster
21:31 semiosis i think option 2 is the best of those, followed by then copying the merged data into a new gluster volume of desired layout
21:31 semiosis one caveat would be just how to merge without overwriting real files with those zero-len placeholders
21:32 toruonu rsync -abuW --min-size 1 --exclude="\.gluster" --stats --progress
21:32 semiosis nice
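Spelled out with comments, that salvage rsync does roughly the following; the source and destination paths here are stand-ins for the actual bricks and the temporary merge target, and the exclude pattern is kept exactly as quoted above:

    # -a archive, -b back up anything that would be overwritten instead of
    # clobbering it, -u never replace a newer file on the destination,
    # -W copy whole files; --min-size 1 skips the zero-length heal
    # placeholders mentioned above, and the exclude drops gluster metadata
    rsync -abuW --min-size 1 --exclude="\.gluster" --stats --progress \
        /d35/ recovery-host:/srv/home0-merge/
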
21:32 toruonu the trouble with option 2 is that it lengthens the downtime until the merging is complete :)
21:33 toruonu then again I'm not sure I'd want to start gluster while rsync is ongoing either
21:33 toruonu so probably no difference at this point
21:34 toruonu but I have to say, what happened should be somehow addressed
21:34 semiosis i would set background self heal count to 1, wait for heals to slow down, then stop volume before rsyncing data out
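For reference, the knob semiosis means here is a volume option; a sketch against the volume from this thread (worth double-checking the exact option name against your build's documented volume options):

    # heal at most one file per client in the background at a time
    gluster volume set home0 cluster.background-self-heal-count 1
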
21:34 toruonu the command should warn that changing replication factor and using start will change replica groups and mess up things while starting to balance stuff
21:34 toruonu well the volume is stopped right now
21:34 toruonu has been for hours
21:34 semiosis oh
21:34 semiosis ok
21:34 toruonu I wanted to stop it fucking up stuff :)
21:35 semiosis yeah
21:35 melanor91 joined #gluster
21:36 zaitcev joined #gluster
21:37 toruonu so … should I file a bug/change request? I think the start command should not be allowed with replica change command as it will fuck up the structure
21:37 glusterbot http://goo.gl/UUuCq
21:37 toruonu glusterbot thinks so :p
21:39 partner joined #gluster
21:45 toruonu btw how do others work past the negative lookup cache issue that's with native mount
21:46 toruonu wouldn't running VM's on gluster cause the same issue as with users issues that various library paths are always configured in excess and running through them will slow down stuff
21:46 toruonu due to excessive lookups
21:48 semiosis optimize the path list... that's what you do for php
21:50 stevenlokie working on Gluster 3.2.6 - a move went bad and stalled both machines - since then I'm unable to act on the moves for this - can't get a status, abort, or restart this, any ideas on how to correct?
21:54 toruonu semiosis: that's not always an option if you don't fully control the application
21:54 toruonu we run a lot of applications from CVMFS etc that are repository packed elsewhere and define the environment themselves. That means that we cannot optimize this, we can only host it
21:55 toruonu and doing it for tens if not hundreds of applications would not be feasible anyway especially considering how easy the fix would be if only it would be implemented :)
21:56 toruonu nfs has it, but I've had too many bad experiences with glusterfs nfs with lockups and you do lose the benefit of connecting to one node only at startup and after that the communication is done with all of them allowing independent gluster nodes to be restarted for what ever reasons while work continues
21:57 sjoeboo_ joined #gluster
21:58 semiosis toruonu: have you seen https://github.com/jdarcy/negative-lookup ?
21:58 glusterbot Title: jdarcy/negative-lookup · GitHub (at github.com)
21:59 semiosis idk what the current status of that is in regards to the latest glusterfs
21:59 semiosis jdarcy: you around?
22:02 toruonu afaik the status is that git has updated to 3.3 compatibility, but there is no guide how to implement that in an rpm based installation. From the tutorial that I've read (to be fair I do not have plans to start writing translators myself) you do need at least a partial checkout of the git tree of whole gluster to be able to compile the translator and the whole implementation part is a bit vague. I don't argue that it might be a great tutori
22:02 toruonu al for translator writing, but not for trying this particular addon in production :)
22:02 luis_alen left #gluster
22:04 toruonu also I think it doesn't have cache expiration etc and might have issues with parallel usage
22:13 jjnash left #gluster
22:13 chirino joined #gluster
22:28 Nicolas_Leonidas ok I successfully created a replica volume and started it, using gluster volume create rimagesvolume replica 2 newwww3.domain.com:/r_images newwww3.domain.com:/r_images
22:29 Nicolas_Leonidas sorry that is gluster volume create rimagesvolume replica 2 newwww3.domain.com:/r_images newwww1.domain.com:/r_images
22:29 Nicolas_Leonidas but when I create a file on newwww3.domain.com:/r_images I can't see the same file on newwww1.domain.com:/r_images
22:29 Nicolas_Leonidas should I be able to do that? am I missing something?
22:36 _ilbot joined #gluster
22:36 Topic for #gluster is now  Gluster Community - http://gluster.org | Q&A - http://community.gluster.org/ | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - http://irclog.perlgeek.de/gluster/
22:36 toruonu so basically you mount -t glusterfs some.gluster.server:rimagesvolume /r_images
22:36 toruonu and then when you write any files there those will appear in newwww3.domain.com:/r_images etc
22:36 toruonu but you shouldn't go and touch the bricks themselves unless to fix some split-brain etc
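Concretely, the client-side mount toruonu is describing would look something like this; the hostname, volume, and mount point follow the ones in this conversation and are otherwise placeholders:

    # one-off mount from any server in the pool
    mkdir -p /mnt/r_images
    mount -t glusterfs newwww3.domain.com:/rimagesvolume /mnt/r_images

    # or make it permanent via /etc/fstab:
    # newwww3.domain.com:/rimagesvolume  /mnt/r_images  glusterfs  defaults,_netdev  0 0
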
22:36 Nicolas_Leonidas I seee
22:37 toruonu take the bricks as block devices in a mirror, you don't normally mount those directly and access them directly. You use the final mirror device (the meta device)
22:38 toruonu though in the case of gluster the bricks are lying on usual filesystem so you COULD touch the files straight up, but then you are operating on them outside of gluster knowledge
22:38 toruonu though not sure what gluster does when it finds files there without the respective .gluster/... guid etc attached
22:39 toruonu ok guys … am off to sleep, it's nearly 1 AM and I've got 177GB / 453 GB rsynced… it should be done by morning
22:40 Nicolas_Leonidas thanks toruonu
23:15 sjoeboo_ joined #gluster
23:15 jim` joined #gluster
23:16 DrVonNostren Hi everybody, I have created a 6 x 2 = 12 distributed replicated cluster out of 750GB xfs bricks, however, when mounted on my client (using gluster native) it only shows up as a size of 3.8T when I believe I should be getting 4.5T, can anyone help me shed some light on this?
23:18 H__ what's the available space on those 750GB bricks ?
23:20 DrVonNostren 750G
23:23 lorderr joined #gluster
23:23 aliguori joined #gluster
23:27 hateya joined #gluster
23:30 raven-np joined #gluster
23:36 melanor9 joined #gluster
23:39 DrVonNostren Hi everybody, I have created a 6 x 2 = 12 distributed replicated cluster out of 750GB xfs bricks, however, when mounted on my client (using gluster native) it only shows up as a size of 3.8T when I believe I should be getting 4.5T, can anyone help me shed some light on this?
23:48 jiffe98 DrVonNostren: local filesystem overhead?
23:49 DrVonNostren 700 gigs of it?
23:51 DrVonNostren df -h  on the servers shows each brick as 750GB, with only 33MB usage (I am presuming that 33MB is filesystem overhead) so 33MB x 12 ~ 400MB of filesystem overhead
23:51 DrVonNostren jiffe98: correct me if wrong
23:52 jiffe98 DrVonNostren: 750GB under Size or under Avail when you df -h?
23:52 DrVonNostren actually both
23:52 jiffe98 what fs?
23:52 DrVonNostren xfs throughout
23:54 jiffe98 I haven't played much with xfs so I'm not sure what kind of overhead it has but fs overhead is what takes up all of my difference otherwise the numbers match
23:55 DrVonNostren appreciate the help jiffe98 I have to run though
