IRC log for #gluster, 2013-02-14

All times shown according to UTC.

Time Nick Message
00:10 raven-np joined #gluster
00:12 hagarth joined #gluster
00:16 plarsen joined #gluster
00:28 sjoeboo_ joined #gluster
00:39 mkultras joined #gluster
00:45 sjoeboo_ joined #gluster
00:59 al joined #gluster
01:28 sjoeboo_ joined #gluster
01:38 mooperd_ joined #gluster
01:51 ultrabizweb joined #gluster
02:04 aliguori joined #gluster
02:09 ultrabizweb joined #gluster
02:40 sjoeboo_ joined #gluster
02:53 raven-np joined #gluster
02:56 mkultras joined #gluster
03:02 pipopopo_ joined #gluster
03:09 mkultras hey i have 2 ubuntu 12 servers that had 3.2.5 glusterfs-server on them and am upgrading to 3.3 using the repo, so i ran glusterd --xlator-option *.upgrade=on -N, started the daemons, ran peer probe, and volume info said i have no volumes anymore. apparently my storage wasn't mounted at the time i ran the upgrade, so i tried to just make the volume again but when i run gluster volume create kalturastorage
03:09 mkultras replica 2 transport tcp glusternode1:/storage glusternode2:/storage i get /storage or a prefix of it is already part of a volume
03:09 glusterbot mkultras: To clear that error, follow the instructions at http://goo.gl/YUzrh or see this bug http://goo.gl/YZi8Y
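The fix glusterbot links to essentially amounts to clearing the leftover GlusterFS metadata from the old brick directory before re-creating the volume. A rough sketch, assuming the /storage brick path from the messages above (run on each server whose brick carries the stale metadata; the service name varies by distro):

    setfattr -x trusted.glusterfs.volume-id /storage
    setfattr -x trusted.gfid /storage
    rm -rf /storage/.glusterfs
    service glusterfs-server restart    # or: service glusterd restart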
03:09 pipopopo joined #gluster
03:09 mkultras ohyeah, i read that before. i get it
03:16 bharata joined #gluster
03:19 lala joined #gluster
03:20 lala__ joined #gluster
03:24 mkultras ah ya, it's still hanging with the self heal failing. nice, i have a self heal command now though, that's cool
03:24 mkultras i have this problem -> http://comments.gmane.org/gmane.comp.file-systems.gluster.user/10709
03:24 glusterbot <http://goo.gl/Em6SX> (at comments.gmane.org)
03:24 mkultras like exactly
03:27 lala_ joined #gluster
03:42 lala__ joined #gluster
03:42 _NiC joined #gluster
03:44 lala joined #gluster
03:50 mkultras joined #gluster
03:56 nhm joined #gluster
03:59 RobertLaptop joined #gluster
04:06 sgowda joined #gluster
04:06 overclk joined #gluster
04:09 rastar joined #gluster
04:16 pai joined #gluster
04:26 jmara joined #gluster
04:31 sahina joined #gluster
04:43 vpshastry joined #gluster
04:44 lala joined #gluster
04:48 mooperd joined #gluster
04:50 sripathi joined #gluster
04:58 overclk joined #gluster
04:59 shylesh joined #gluster
05:13 hagarth joined #gluster
05:27 raven-np joined #gluster
05:29 bala1 joined #gluster
05:32 sripathi joined #gluster
05:32 deepakcs joined #gluster
05:34 tjikkun_ joined #gluster
05:40 an joined #gluster
05:41 ramkrsna joined #gluster
05:41 ramkrsna joined #gluster
05:42 sripathi joined #gluster
05:44 rastar joined #gluster
06:01 glusterbot New news from resolvedglusterbugs: [Bug 905871] Geo-rep status says OK , doesn't sync even a single file from the master. <http://goo.gl/CpA33>
06:04 cjohnston_work joined #gluster
06:08 satheesh joined #gluster
06:12 shylesh joined #gluster
06:26 sgowda joined #gluster
06:31 an joined #gluster
06:41 sripathi joined #gluster
06:44 sripathi joined #gluster
06:45 rgustafs joined #gluster
06:51 an joined #gluster
06:59 ricky-ticky joined #gluster
07:01 vimal joined #gluster
07:01 Ryan_Lane joined #gluster
07:05 sgowda joined #gluster
07:05 Nevan joined #gluster
07:06 raven-np joined #gluster
07:07 raghu joined #gluster
07:08 Guest77058 joined #gluster
07:08 lala__ joined #gluster
07:18 thtanner joined #gluster
07:26 cw joined #gluster
07:29 ekuric joined #gluster
07:33 ctria joined #gluster
07:33 abyss^_ Is there any list of gluster tuning settings (with descriptions)? Like cache etc?
07:45 guigui3 joined #gluster
07:46 DeltaF joined #gluster
07:46 DeltaF Hi folks. Anyone still awake?
07:48 an joined #gluster
07:55 dobber joined #gluster
08:02 abyss^_ ok, I'm blind, all is in admin guide ;)
08:03 vpshastry left #gluster
08:10 mgebbe joined #gluster
08:11 mgebbe_ joined #gluster
08:13 cw Hi; '[2013-02-14 09:13:29.259406] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"'
08:14 cw is this a critical or just FYI message from gluster? and how do I make it stop :)
08:14 cw everything seems to work just fine with sync etc
08:16 cw both my volumes have 'Transport-type: tcp' in their status output
08:17 rastar joined #gluster
08:18 DeltaF sounds fun.
08:20 lala__ joined #gluster
08:22 sripathi joined #gluster
08:35 sripathi1 joined #gluster
08:40 andrei__ joined #gluster
08:40 ngoswami joined #gluster
08:48 ekuric joined #gluster
08:59 overclk joined #gluster
09:00 Staples84 joined #gluster
09:03 gbrand_ joined #gluster
09:03 gbrand__ joined #gluster
09:04 gbrand_ joined #gluster
09:15 ctria joined #gluster
09:16 lala__ joined #gluster
09:18 sripathi joined #gluster
09:30 ramkrsna joined #gluster
09:30 ekuric joined #gluster
09:30 satheesh joined #gluster
09:31 hagarth joined #gluster
09:34 ekuric joined #gluster
09:35 venkat joined #gluster
09:41 sripathi1 joined #gluster
09:45 glusterbot New news from newglusterbugs: [Bug 902953] Clients return ENOTCONN or EINVAL after restarting brick servers in quick succession <http://goo.gl/YhZf5>
10:01 lala__ joined #gluster
10:07 bauruine joined #gluster
10:22 andrei_ joined #gluster
10:32 sgowda joined #gluster
10:34 ctria joined #gluster
10:43 rastar joined #gluster
10:45 lala__ joined #gluster
10:46 lala joined #gluster
10:52 pkoro joined #gluster
10:54 eseyman1 joined #gluster
10:54 andrei_ joined #gluster
10:54 eseyman1 left #gluster
11:02 davidbomba joined #gluster
11:02 davidbomba morning!
11:02 turbo124 Quick Q for my Gluster peeps.
11:03 turbo124 I am trying to geo-replicate some Virtual Machines, however during the replication process the VMs' filesystems are getting put into read-only mode, i presume due to the file locking imposed by rsync. i haven't seen any literature saying this occurs... just needed confirmation that this is expected behaviour?
11:10 lh joined #gluster
11:11 hagarth joined #gluster
11:13 Maledictus joined #gluster
11:23 tjikkun_work joined #gluster
11:31 duerF joined #gluster
11:34 sripathi joined #gluster
11:39 venkat joined #gluster
11:40 andrei_ joined #gluster
11:43 andrei_ joined #gluster
11:45 purpleidea joined #gluster
11:45 purpleidea joined #gluster
11:50 andrei_ joined #gluster
11:52 andrei_ joined #gluster
11:56 sripathi joined #gluster
11:57 gbrand_ joined #gluster
12:01 Norky joined #gluster
12:01 Norky good morning
12:02 satheesh joined #gluster
12:02 Norky hmm, no telling off from glusterbot... I'm disappointed ;)
12:02 kkeithley1 joined #gluster
12:05 shireesh joined #gluster
12:07 Norky is a long listing (ls -l) of a directory that contains many (>10,000) files expected to take much longer on gluster than a local filesystem?
12:08 Norky local copy takes 0.3s, over gluster-FUSE it takes >10s
12:08 Norky and this is with -n to obviate any uid/gid lookups
12:11 Norky also, I have five volumes. One of them is not being made available over NFS from any of the gluster servers
12:16 Norky http://fpaste.org/LLBB/ note this only affects one volume. The others are fine.
12:16 glusterbot Title: Viewing specific gluster volume not available via NFS by Norky (at fpaste.org)
12:20 NeonLicht Is there any way to force stopping and deleting a volume, please?  I've created one for testing and now I can neither remove-brick nor stop nor delete the volume.
12:21 xian1 joined #gluster
12:25 bulde joined #gluster
12:36 luis_alen joined #gluster
12:39 manik joined #gluster
12:42 mooperd joined #gluster
12:45 luis_alen VSpike: Hello, do you remember the question I asked here yesterday? It was about high cpu usage on the gluster client of a volume that only stores static content for a web server. You mentioned that it could be caused by apache stat() calls, right? Well, can you tell me more about how gluster handles those stat() calls? I'm wondering if nfs with fscache would deliver better performance for my scenario…
12:46 hagarth joined #gluster
12:47 luis_alen VSpike: I figured that the php app also does a lot of stat() calls in order to check for the existence of some static content… In case a stat() call is performance killing for gluster, I'll really need to put gluster aside on that...
12:52 shireesh joined #gluster
12:54 jclift_ joined #gluster
13:18 x4rlos When i try and remove a peer (cos i have changed my address ranges) i get this:
13:18 x4rlos root@client1:/mnt# gluster peer detach dev
13:18 x4rlos One of the peers is probably down. Check with 'peer status'.
13:20 x4rlos shouldn't i be able to force it to detach somehow?
13:21 awickham joined #gluster
13:21 x4rlos (to be fair, i use dns, so when i update, it will work after i change the attributes for access - but this is a quick test to see how i can detach the peer).
13:21 x4rlos Or do i have to change the volume from replica2? .... oh. I may have just answered myself.
13:21 Humble joined #gluster
13:24 partner luis_alen: a stat always triggers a self-heal check for that particular file. also php does a lot of things that kill performance
13:25 luis_alen partner: the php app itself is not running on top of gluster
13:26 partner but it accesses the files a lot in gluster?
13:26 partner luis_alen: have you read this already? http://joejulian.name/blog/optimizing-web-performance-with-glusterfs/
13:26 glusterbot <http://goo.gl/uDFgg> (at joejulian.name)
13:27 luis_alen partner: yeah, it does. It has a caching system that checks for the existence of these static files. If they're not available in the cache, the app creates them there.
13:27 luis_alen partner: but for each of these checks, there's a stat() call
13:28 luis_alen partner: Yeah, i've read it
13:28 partner uh, that kind of kills the whole cache with gluster, doesn't it?
13:29 partner what about using it over nfs, it would cache there in between stats and stuff
13:29 luis_alen partner: that's what I'm wondering
13:30 partner http://joejulian.name/blog/nfs-mount-for-glusterfs-gives-better-read-performance-for-small-files/
13:30 glusterbot <http://goo.gl/5IS4e> (at joejulian.name)
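For reference, the NFS approach being discussed is simply mounting the volume through gluster's built-in NFS server instead of FUSE, so the kernel NFS client can cache attributes between those stat() calls. A minimal sketch with hypothetical server and volume names (gluster's NFS server speaks NFSv3, hence the options):

    mount -t nfs -o vers=3,nolock server1:/myvol /var/www/static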
13:30 luis_alen partner: however, I see that there's no support for acls. Is that correct?
13:35 partner luis_alen: i recall reading somewhere it was under code-review currently
13:35 partner so probably not available just yet
13:36 partner but better to wait for authoritative answer for that
13:37 manik joined #gluster
13:41 edward1 joined #gluster
13:44 raven-np joined #gluster
13:45 plarsen joined #gluster
13:48 mooperd_ joined #gluster
13:54 mooperd joined #gluster
14:03 samppah hmmmhsdh
14:07 samppah vm images stored on gluster 3.4 seem to be very very fast.. but i'm still bit unsure if it's using qemu instead of fuse
14:07 NeonLicht Is there any way to force stopping and deleting a volume, please?  I've created one for testing and now I can neither remove-brick nor stop nor delete the volume.
14:08 samppah NeonLicht: gluster volume stop volname force ?
14:10 partner from the very command line utility by issuing "gluster help":
14:10 partner volume stop <VOLNAME> [force] - stop volume specified by <VOLNAME>
14:10 partner as samppah already stated..
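Spelled out, the force stop and the follow-up delete look roughly like this (volume name taken from NeonLicht's messages; even the force form still needs glusterd reachable on every peer):

    gluster volume stop glutest force
    gluster volume delete glutest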
14:10 Staples84 joined #gluster
14:11 partner (imo it was stupid to remove a man page..)
14:11 partner though i understand it's better not to have outdated docs around..
14:12 ramkrsna joined #gluster
14:13 NeonLicht Thank you, partner, although that doesn't help.  I still get 'Stopping volume glutest has been unsuccessful'.
14:15 NeonLicht It's driving me crazy.  I can't do anything at all.
14:16 glusterbot New news from newglusterbugs: [Bug 911160] Following a change of node UUID I can still reconnect to the volume <http://goo.gl/gdN37>
14:18 mooperd_ joined #gluster
14:18 NeonLicht Perhaps removing glusterfs and purging its config files on every server will help me get rid of that volume?
14:18 nueces joined #gluster
14:22 partner NeonLicht: i don't have any details of your case and i need to go anyways so stick around, the pros will appear probably shortly..
14:22 NeonLicht Thank you, partner.
14:28 jack joined #gluster
14:28 dustint joined #gluster
14:30 VSpike luis_alen: thanks for the update - i'm new to gluster too, but I'd heard the same - that stat must check every node due to lack of centralised metadata, and that NFS is better in performance terms (but possibly also more flaky, and you lose the HA and load balancing unless you implement it yourself)
14:32 luis_alen VSpike: losing easy HA and load balancing concerns me, but it's worth trying.
14:33 luis_alen Vspike: why does it need to check every node on a stat call?
14:41 shireesh joined #gluster
14:42 venkat joined #gluster
14:47 VSpike luis_alen: I'm not sure :)
14:50 NeonLicht It did not work.  :-(
14:53 shireesh2 joined #gluster
14:55 aliguori joined #gluster
15:00 bdperkin joined #gluster
15:03 awickham so i'm having an issue attaching bricks to a replicated volume. it returns host not connected. i run a peer status and the node shows as connected. i check the logs and it doesn't appear to show any errors...am i missing something?
15:04 stopbit joined #gluster
15:09 Staples84 joined #gluster
15:15 Maledictus in gluster volume status I see some hosts as IP, can I change that to their hostname?
15:15 ndevos ~hostnames | Maledictus
15:15 glusterbot Maledictus: Hostnames can be used instead of IPs for server (peer) addresses. To update an existing peer's address from IP to hostname, just probe it by name from any other peer. When creating a new pool, probe all other servers by name from the first, then probe the first by name from just one of the others.
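A sketch of what glusterbot describes, with hypothetical peers server1 and server2 (server1 being the one currently listed by IP):

    gluster peer probe server1      # run from any other peer: updates the IP entry to the hostname
    # when building a new pool from scratch:
    gluster peer probe server2      # run on server1
    gluster peer probe server1      # run on server2, so server1 is also known by name
    gluster peer status             # should now list hostnames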
15:17 bugs_ joined #gluster
15:17 Maledictus ndevos, glusterbot: thanks :)
15:17 hagarth joined #gluster
15:17 ndevos :)
15:19 Maledictus nice, now it looks much better. any way to have the output of, say, volume rebalance status sorted by hostname?
15:22 NeonLicht Is there any way to force stopping and deleting a volume, please?  I've created one for testing and now I can neither remove-brick nor stop nor delete the volume.  volume stop <VOLNAME> [force] fails: 'Stopping volume glutest has been unsuccessful'.
15:25 jskinner_ joined #gluster
15:29 DeltaF joined #gluster
15:29 ndevos Maledictus: that got introduced with http://review.gluster.org/4416 which seems to be included in 3.4
15:29 glusterbot Title: Gerrit Code Review (at review.gluster.org)
15:30 Maledictus ndevos: thanks again :)
15:31 ndevos you're welcome
15:32 bdperkin joined #gluster
15:34 Maledictus Anyone else seeing the fuse client being way slower than the nfs client? I'm doing 600-800 http requests/s over apache to an NFS share with static, small files, while fuse will do about 70 req/s. I thought the fuse client would be faster?
15:34 Maledictus Also with dd NFS is much faster than fuse for a single stream
15:36 ndevos http://joejulian.name/blog/nfs-mount-for-glusterfs-gives-better-read-performance-for-small-files was given earlier today already :)
15:36 glusterbot <http://goo.gl/pva1b> (at joejulian.name)
15:40 Maledictus thanks for the third time :)
15:40 chouchins joined #gluster
15:43 ndevos I think thats enough helping for me for today :)
15:48 Maledictus insert coin ;)
16:00 daMaestro joined #gluster
16:04 z00mz00m joined #gluster
16:04 z00mz00m hi i have a question, I have a distributed volume setup with 1 brick, but as time has gone on the .glusterfs directory continues to grow even when files are deleted.  is this by design?
16:07 z00mz00m [root@host1 dir1]# du -skh dst
16:07 z00mz00m 99G     dst
16:07 z00mz00m [root@host1 dir1]# cd dst
16:07 z00mz00m [root@host1 dst]# du -skh *
16:07 z00mz00m 1.2G    file1
16:07 z00mz00m 2.4G    file2
16:07 z00mz00m 4.4G    file3
16:07 z00mz00m 108M    file4
16:07 z00mz00m [root@host1 dst]# du -skh .glusterfs/
16:07 z00mz00m 99G     .glusterfs/
16:07 z00mz00m was kicked by glusterbot: message flood detected
16:07 z00mz00m joined #gluster
16:07 z00mz00m err sorry
16:08 NeonLicht Where is that .glusterfs/ dir located, z00mz00m?
16:08 z00mz00m http://pastebin.com/CsttR4Lp
16:08 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
16:09 z00mz00m its located in the root of the brick
16:09 Shdwdrgn joined #gluster
16:09 z00mz00m http://dpaste.org/2Bhxq/
16:09 glusterbot Title: dpaste.de: Snippet #219227 (at dpaste.org)
16:10 NeonLicht I cannot find it in any brick, maybe that's why I have so many problems.
16:10 z00mz00m its created by glusterfsd and used for parity against replicated drives
16:10 z00mz00m im not even sure why it exists on a distributed with 1 brick, but i guess in case i had more
16:10 z00mz00m but it's strange, when i delete stuff from the drive it should be deleted out of there too
16:11 NeonLicht I see.
16:11 z00mz00m i have another gluster volume the same thing is happening on, same setup, distributed with 1 brick
16:12 z00mz00m when i delete stuff off that volume, the space doesn't actually get freed up
16:12 NeonLicht What's the logic of having distributed with 1 brick?
16:12 z00mz00m if i want to expand
16:12 z00mz00m and it allows me to network mount
16:12 NeonLicht Oh, I see.
16:13 NeonLicht I made a volume with replica 2 and now I'm stuck with it, I can't do anything, not even stop it.
16:13 z00mz00m my 2 replicated gluster volumes seem to work perfectly as the files fluctuate in size
16:15 NeonLicht I ven uninstalled and purged glusterfs of all of the servers and, when I install glusterfs again, the volume is still there, with all the same problems as before.
16:15 NeonLicht s/ven/even/
16:15 glusterbot NeonLicht: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
16:16 z00mz00m ports arent blocked by software firewall is it?
16:16 NeonLicht In my case? No, thery aren't.
16:17 z00mz00m when you deleted the volume to recreate, did you unset the gluster attrs?
16:17 NeonLicht But I can't see any .glusterfs directories.
16:17 NeonLicht I can't delete the volume, I wish I could.
16:18 NeonLicht What gluster attrs?  Where can I read about that?
16:18 semiosis @qa releases
16:19 glusterbot semiosis: The QA releases are available at http://bits.gluster.com/pub/gluster/glusterfs/ -- RPMs in the version folders and source archives for all versions under src/
16:19 z00mz00m http://www.gluster.org/community/documentation/index.php/Arch/A_Newbie%27s_Guide_to_Gluster_Internals
16:19 glusterbot <http://goo.gl/3ntVd> (at www.gluster.org)
16:19 z00mz00m xattrs
16:19 semiosis the path or a prefix of it is already part of a volume
16:19 glusterbot semiosis: To clear that error, follow the instructions at http://goo.gl/YUzrh or see this bug http://goo.gl/YZi8Y
16:19 NeonLicht Thank you, z00mz00m.
16:20 z00mz00m np
16:20 z00mz00m 1.3T    .glusterfs/
16:20 z00mz00m lol.
16:20 z00mz00m why does it do thiisss
16:21 NeonLicht Man, that grows really fast.
16:22 z00mz00m its a 2tb fs, few months old, did some big copies to test and sure enough... when i delete the old copy the space isn't freed from the fs
16:23 z00mz00m the first one, i rm -rf'd the directory yet 99gig is still in use
16:23 z00mz00m because of .glusterfs
16:23 z00mz00m im using the latest stable in centos, 3.3.1
16:26 z00mz00m guess i could just rm -rf the contents of .glusterfs but that seems crude
16:28 thtanner left #gluster
16:28 luckybambu joined #gluster
16:33 NeonLicht Do you also have such a hidden directory in the replicated volumes?
16:34 z00mz00m yeah, but it seems to remain the same as the space of the volumes
16:34 z00mz00m so it's only the distributed volumes that are holding space open on file delete
16:35 DeltaF It's not a terrible idea if I need to keep adding/removing bricks in a replicated volume for a (manually) scaled cluster, right?
16:46 bala joined #gluster
16:46 Shdwdrgn joined #gluster
16:54 aliguori joined #gluster
16:59 an joined #gluster
17:01 NeonLicht Why do you want/need to do so, DeltaF?
17:03 JoeJulian z00mz00m: Was the file you deleted a hardlink?
17:04 aswickham joined #gluster
17:08 mkultras i have a folder on my gluster mount with 8000 files in it
17:08 mkultras i have kaltura trying to list the files in this folder and its timing out
17:08 DeltaF NeonLicht: I'm trying to setup a load balanced set of web servers. I figured this was the easiest way to sync the file system across. Since it will be holding lots of PHP files, I plan to use the NFS mount method.
17:09 mkultras i noticed if i ls once it takes like 2  minutes the first time then 8 seconds the second time
17:09 mkultras i need to look at speeding this up if possible
17:09 DeltaF I only need multiple servers for a month at most. After that, it would be back to a volume with 1 brick (main server)
17:10 lala joined #gluster
17:14 NeonLicht I see, DeltaF.
17:15 DeltaF I'm hoping the NFS performance would be close to "native" FS (if not better.) Of course, it's on AWS/EC2 so native is still somewhat virtual.
17:15 z00mz00m when you're running 1 brick and you delete files, do you get the space back?
17:21 NeonLicht I did, z00mz00m, and I did not have a hidden dir.
17:22 DeltaF Hoping to implement and test today. whee.
17:25 cyberbootje joined #gluster
17:26 aswickham so i'm having an issue attaching bricks to a replicated volume. it returns host not connected. i run a peer status and the node shows as connected. i check the logs and it doesn't appear to show any errors...any ideas?
17:28 lala_ joined #gluster
17:31 JoeJulian mkultras: Perhaps a rebalance might help. See http://joejulian.name/blog/dht-misses-are-expensive/
17:31 glusterbot <http://goo.gl/A3mCk> (at joejulian.name)
17:31 NeonLicht Is there any way to force stopping and deleting a volume, please?  I've created one for testing and now I can neither remove-brick nor stop nor delete the volume.  volume stop <VOLNAME> [force] fails: 'Stopping volume glutest has been unsuccessful'.
17:32 JoeJulian mkultras: But directory listings are known to be slow. Can you tree those files out a little so your directories are smaller?
17:33 JoeJulian NeonLicht: Did you check your logs to see why it's unsuccessful? Usually it's because a peer doesn't have glusterd running.
17:34 NeonLicht I try, JoeJulian, but I don't seem to find (understand) what's relevant.
17:34 JoeJulian NeonLicht: the brute-force approach is to stop glusterd on your servers, and delete the volume directory from below /var/lib/glusterd/vols
17:34 JoeJulian NeonLicht: Have you posted the log to [fd]paste so someone can take a look?
17:35 NeonLicht Thank you, JoeJulian, i've tried uninstalling and purging to no avail before, I'll try deleting the vols.
17:35 JoeJulian uninstalling doesn't delete the state.
17:35 JoeJulian I don't know what you mean by purging.
17:35 NeonLicht No, JoeJulian, I haven't, since I can't seem to recognise what's the relevant information in them.
17:35 NeonLicht JoeJulian, I mean purging the configuration files (apt-get remove --purge glusterfs-*).
17:36 JoeJulian state files != configuration files
17:36 NeonLicht I see, JoeJulian, I've found that vols dir you refered to... I'm going to delete it for good.  Thank you,
17:37 JoeJulian If you're trying to wipe out the configuration entirely, stop glusterd and rm -rf /var/lib/glusterd/*
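A rough illustration of that brute-force cleanup (destructive: it erases volume definitions and, in the second form, peer state too; repeat on every server only if a completely clean slate is the goal; service name varies by distro):

    service glusterd stop                      # or: killall glusterd
    rm -rf /var/lib/glusterd/vols/glutest      # drop just the broken volume's definition
    # rm -rf /var/lib/glusterd/*               # ...or wipe the whole configuration
    service glusterd start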
17:40 JoeJulian aswickham: Could you fpaste the peer status, volume status, and the command that's returning the error?
17:41 DeltaF JoeJulian: Web front-ends mounting the web dir via replicated volume and accessed via NFS. Sound OK? Should I have serious issues adding/removing hosts? Performance?
17:42 NeonLicht Thank you, JoeJulian, now I really got rid of the broken volume!  I can start over testing again.  :-)
17:42 JoeJulian You're welcome
17:42 Shdwdrgn joined #gluster
17:42 ctria joined #gluster
17:43 JoeJulian DeltaF: So each web server is a replica of the same data and you're nfs mounting locally?
17:44 DeltaF Yes. NFS is to avoid the PHP/stat performance issues.
17:44 JoeJulian That breaks both the http://joejulian.name/blog/glusterfs-replication-dos-and-donts/ philosophies and the ,,(php) one.
17:44 glusterbot <http://goo.gl/B8xEB> (at joejulian.name)
17:44 glusterbot http: php calls the stat() system call for every include. This triggers a self-heal check which makes most php software slow as they include hundreds of small files. See http://goo.gl/uDFgg for details.
17:44 glusterbot php calls the stat() system call for every include. This triggers a self-heal check which makes most php software slow as they include hundreds of small files. See http://goo.gl/uDFgg for details.
17:45 an joined #gluster
17:46 JoeJulian Performance is really going to suck as you're using up write bandwidth writing the same file to 4000 servers per data center (if you're ebay) and self-heal checking them all as well.
17:46 DeltaF If I were ebay I would have a dedicated cluster.
17:46 JoeJulian And I always like to plan for success.
17:46 DeltaF I'm talking on the scale of 5-10 servers. Closer to 5.
17:46 DeltaF And only for the next few weeks
17:47 DeltaF 11 months out of the year it's running on a single server.
17:47 DeltaF 11.5 months really.
17:47 JoeJulian My overriding philosophy is that you know your systems and your requirements and should do what you think is best for your application.
17:47 bluefoxxx joined #gluster
17:47 bluefoxxx my gluster replicant pair is not automatic self-healing :(
17:48 DeltaF OK. I read that post last night, plus some others. The 1-minute NFS cache is an acceptable risk. Much better than the 5 minute rsync the last guy had
17:48 DeltaF It's 99% reads and much is already offloaded to CDN.
17:50 DeltaF I know I'm in that odd spot where I need the shared file access + redundancy but not enough to build it in a modular fashion.
17:51 DeltaF lmk if you think I'm off base in any particular point. :)
17:54 bluefoxxx oh self-heal happens on access ok
17:55 aswickham http://fpaste.org/98Sv/
17:55 glusterbot Title: Viewing Paste #277486 by aswickham (at fpaste.org)
17:55 cyberbootje1 joined #gluster
18:01 rubbs joined #gluster
18:01 JoeJulian aswickham: Interesting... how about the /var/log/glusterfs/{cli.log,etc-glusterfs-glusterd.vol.log}
18:03 DeltaF ok then. thanks for trying to talk some sense into me.
18:03 JoeJulian Hehe, any time.
18:04 rubbs I'm sort of a FS newb, and was curious as to if there were some good reasons why to use XFS vs EXT4. Does anyone have any lit I can read on when to use which?
18:04 NeonLicht Should it be possible to mount a volume from any of the servers or only from the one you created it on?  I think the former, but maybe I have misinterpreted it and I'm trying something impossible?
18:04 DeltaF JoeJulian: do you think it won't work, or you just think it's not ideal?
18:05 JoeJulian DeltaF: If, when you test this, your file access times get excessive, I would reconsider only using enough replicas to satisfy fault tolerance.
18:05 hagarth joined #gluster
18:05 JoeJulian ~ext4 | rubbs
18:05 glusterbot rubbs: Read about the ext4 problem at http://goo.gl/PEBQU
18:05 flrichar joined #gluster
18:06 JoeJulian NeonLicht: You should be able to mount a volume from any (and I'm going to use this word correctly in this and only this instance) node on your network, whether it's a server or not.
18:06 aswickham http://fpaste.org/XQsB/
18:06 glusterbot Title: Viewing Paste #277496 by aswickham (at fpaste.org)
18:06 DeltaF So, only 2-3 replicas and servers 4+ just connect to one of the others in the group
18:06 Shdwdrgn joined #gluster
18:06 JoeJulian Yep
18:07 rubbs JoeJulian: thx
18:07 DeltaF I guess the issue still remains that when the NFS cache *isn't* a hit, it still goes through the whole transaction/heal
18:08 JoeJulian DeltaF: And every time that cache entry expires.
18:08 DeltaF Is there a way to have the NFS mount failover?
18:08 NeonLicht JoeJulian, I think I didn't ask the question right, because I don't mean from where I try to access but to where.  So, I shouldn't be able to access any exported directory on clients, which don't even run glusterd.
18:08 DeltaF right. not hit == never request, expires
18:08 DeltaF never requested.
18:09 JoeJulian right, no coffee yet.
18:09 NeonLicht I'm sorry, JoeJulian, English isn't my first language and I'm afraid my English isn't good enough to ask the question properly.  :(
18:09 Mo___ joined #gluster
18:09 DeltaF Is there a graceful way to do nfs failover, or do I just have to scramble if one goes down? :)
18:10 JoeJulian DeltaF: You can still run the server locally, even if it isn't part of the volume. That still gives you the local nfs mount.
18:10 tqrst joined #gluster
18:10 DeltaF oh wonderful
18:10 DeltaF still wrapping my head around brick/volume/mount layers
18:10 JoeJulian DeltaF: One other wrench: there have been reports of lockups doing local nfs mounts.
18:11 DeltaF I suppose it's brick/volume/mount/share
18:11 DeltaF wha?
18:11 DeltaF got a link?
18:11 DeltaF I need to head out but I want to read up on that one.
18:12 z00mz00m JoeJulian: have you ever seen a gluster volume where when files are deleted the space isn't freed?  it's a distributed volume with only 1 brick, and the .glusterfs directory seems to be taking up all the space
18:13 tqrst One of the servers that hosted 3 bricks in my 20x2 distributed-replicate volume just died. I want to transfer its 3 brick hard drives to a new server that wasn't previously part of my volume. Are the following steps OK for 3.3.1? 1) transfer drives to new server and mount them 2) gluster peer attach newserver 3) for each brick, "gluster volume myvol replace-brick DEAD:/brickN NEW:/brickN start" 4) same as 3 but "commit force" instead of start.
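A sketch of the commands tqrst is listing, keeping the placeholder names DEAD, NEW and /brickN from the question (the discussion below settles on skipping the "start" phase and going straight to "commit force" once the new peer is probed and the drives are mounted):

    gluster peer probe NEW
    gluster volume replace-brick myvol DEAD:/brickN NEW:/brickN commit force   # once per brick
    gluster volume heal myvol full                                             # let self-heal reconcile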
18:13 JoeJulian DeltaF: bug 849526
18:13 glusterbot Bug http://goo.gl/ZAcyz high, unspecified, ---, rajesh, ASSIGNED , High write operations over NFS causes client mount lockup
18:13 tqrst my main concern is that I want things to pick up where they left off, i.e. the bricks should stay in the same replica pair they used to be in, etc.
18:13 JoeJulian Looks like that was high writes though.
18:14 JoeJulian z00mz00m: I asked earlier if it could be hardlinked somewhere.
18:14 z00mz00m JoeJulian: ah missed it, sorry, no, no hardlinks within these 2 fs.  and its 2 that are doing it
18:15 JoeJulian NeonLicht: Still not completely clear, but you can mount the volume from any server to any client even if that client does not run glusterd.
18:17 JoeJulian tqrst: 1,2,4
18:17 tqrst JoeJulian: no need for 3 any more?
18:17 JoeJulian Not for what you're describing.
18:17 JoeJulian And, in fact, there's 2 issues.
18:17 tqrst (I was going off of http://community.gluster.org/q/a-replica-node-has-failed-completely-and-must-be-replaced-with-new-empty-hardware-how-do-i-add-the-new-hardware-and-bricks-back-into-the-replica-pair-and-begin-the-healing-process/)
18:17 glusterbot <http://goo.gl/4hWXJ> (at community.gluster.org)
18:18 tqrst but I guess that assumes the hard drives died too
18:18 JoeJulian It'll tell you that the path or prefix is already part of a...
18:18 tqrst (note that the hostname will change)
18:18 NeonLicht Thank you, JoeJulian.  I've been able to do it now.  I'm afraid the problems I had were caused by the 'fuse' module not being loaded in one of the servers!  Shame on me!
18:19 z00mz00m JoeJulian:  i put some further information into here, maybe it will help?  http://dpaste.org/5xaHb/
18:19 glusterbot Title: dpaste.de: Snippet #219233 (at dpaste.org)
18:19 JoeJulian So you could actually do 4 first, kill each glusterfsd as it comes up, then 1,2 then restart glusterd.
18:19 JoeJulian NeonLicht: Good news. :)
18:20 NeonLicht Yeah, JoeJulian, thank you! :-)
18:20 tqrst I have the feeling it might fail since the peer won't exist until I attach it
18:20 tqrst but here goes
18:21 JoeJulian Oops, right, 2,4, kill each glusterfsd as it comes up, then 1
18:22 venkat joined #gluster
18:23 glusterbot New news from newglusterbugs: [Bug 903396] Tracker for gluster-swift refactoring work (PDQ.2) <http://goo.gl/wiUbE> || [Bug 904370] Reduce unwanted Python exception stack traces in log entries <http://goo.gl/sjEQp> || [Bug 904629] Concurrent requests of large objects (GET/PUT) can be starved by small object requests <http://goo.gl/vtsQ0>
18:24 NeonLicht I still wonder why I don't have a hidden dir (.glusterfs/) on the root of my bricks as you do, z00mz00m?
18:24 JoeJulian z00mz00m: Each file in .glusterfs is a hardlink to the file in the rest of the filesystem tree. Find the big file in .glusterfs that you think should be deleted, get the inode for that file (ls -li) then find the file with that inode (find -inum). If it doesn't exist, check the logs for errors and file a bug report.
18:24 glusterbot http://goo.gl/UUuCq
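A small illustration of the check JoeJulian describes, run from the brick root; the gfid path and inode number below are placeholders:

    ls -li .glusterfs/aa/bb/aabbcc...                          # note the inode number and link count
    find . -path ./.glusterfs -prune -o -inum 123456 -print    # any named file sharing that inode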
18:24 JoeJulian NeonLicht: Because you're not running 3.3
18:25 NeonLicht I'm running 3.2.7, JoeJulian, indeed.
18:26 z00mz00m hrm, the find comes back with /proc/2486/stat
18:26 z00mz00m which is a vdsm process
18:26 z00mz00m thats odd
18:28 JoeJulian And lsof of that vdsm process probably shows the file that was deleted.
18:28 JoeJulian So since the fd is still open, the filesystem knows not to delete it. Once it's closed, it should delete.
18:29 z00mz00m yeah its weird, lsof doesnt return anything for that file, i had checked it before.  doesnt show anything for the process either.
18:29 bluefoxxx that's weird.
18:29 bluefoxxx Self-healing isn't working 100%, but ... almost 100%
18:30 * bluefoxxx switches the mountpoint from nfs to glusterfs
18:30 z00mz00m so i took another file inode from .glusterfs because i have many to choose from, and that one doesnt return anything in find -inum
18:30 z00mz00m only the .glusterfs entry
18:30 z00mz00m would think if it's just a vdsm process that the space would clear on reboot, it doesn't
18:31 bluefoxxx aha!
18:31 bluefoxxx that got it.
18:31 JoeJulian Ok, check your logs, maybe do a state dump (kill -USR1) for the mount process (glusterfs) and include that with your bug report.
18:32 JoeJulian bluefoxxx: Makes sense. That's why I dislike relying on fscache.
18:32 bluefoxxx JoeJulian, cachefilesd and fscache.ko I assume you're referencing
18:33 bluefoxxx JoeJulian, I have a @reboot cron script that runs find /mnt/exports -noleaf -print0 | xargs --null stat
18:33 bluefoxxx it's slow as hell even at 3 gigs
18:33 JoeJulian The nfs client (kernel) uses fscache.
18:33 bluefoxxx 133300 on the brick, 133343 on the export.
18:33 bluefoxxx inodes
18:33 JoeJulian bluefoxxx: You're running an older version? (<3.3)
18:34 bluefoxxx no I'm running
18:34 bluefoxxx $ gluster --version
18:34 bluefoxxx glusterfs 3.3.1 built on Oct 11 2012 21:49:37
18:34 JoeJulian Then why not just use "gluster volume heal $vol"?
18:34 bluefoxxx because
18:34 bluefoxxx I used google and it did not tell me I could do that.
18:34 JoeJulian hehe
18:34 bluefoxxx I found all kinds of videos and shit explaining how this works.
18:34 JoeJulian It's in the ,,(rtfm) ;)
18:34 glusterbot Read the fairly-adequate manual at http://goo.gl/E3Jis
18:35 * JoeJulian hates video tutorials.
18:35 JoeJulian I saw one recently that was a series of handwritten pages explaining how to do something.
18:36 NeonLicht :-)
18:36 NeonLicht I like talks.
18:36 bluefoxxx use heal info commands what
18:36 NeonLicht But I also hate tutorials.
18:36 JoeJulian gluster volume heal help
18:36 bluefoxxx volume help does not exist.
18:36 JoeJulian Ok, I've got to get in to the office finally. I've killed enough time here at home.
18:38 NeonLicht Thank you very much for killing it helping out, JoeJulian.  :-)
18:38 JoeJulian Ah, right... Love the inconsistency. Just "gluster volume heal"
18:38 JoeJulian someone should file a bug on that
18:38 glusterbot http://goo.gl/UUuCq
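For reference, the heal subcommands in 3.3 look roughly like this (run "gluster volume heal" with no arguments to see the exact usage on a given build):

    gluster volume heal <VOLNAME>                    # heal entries that need it
    gluster volume heal <VOLNAME> full               # crawl the whole volume
    gluster volume heal <VOLNAME> info               # entries still pending heal
    gluster volume heal <VOLNAME> info heal-failed
    gluster volume heal <VOLNAME> info split-brain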
18:39 bluefoxxx JoeJulian, oh ... heal-failed gives a pile of output, interesting.
18:39 bluefoxxx not sure why.
18:42 disarone joined #gluster
18:44 bluefoxxx augh
18:44 bluefoxxx 2013-02-14 12:52:43 <gfid:f32135c8-fd71-4f2f-a54c-5bd98e281f35>
18:44 bluefoxxx /dev/mapper/datadisk-silo0 516057528   3648368 486194968   1% /mnt/silo0 ; hq-ext-store-2:/web  516057472   3736960 486106368   1% /mnt/exports/web
18:50 bluefoxxx JoeJulian, according to the documentation, self-healing was supposed to be a pro-active thing.
18:51 NeonLicht On a replicated volume you need to add bricks in pairs.  Are those pairs fixed forever?  I mean, imagine I add server1:/data and server2:/data and later on I add server3:/data and server4:/data.  Is it possible to remove server1:/data and server3:/data in the future or do they have to go in the same pairs as they are added?
18:58 * bluefoxxx sighs.  Removes the brick, formats the partition, replaces the brick.
18:59 sjoeboo NeonLicht: i'm not sure i follow you... so you start with a replicated setup, 1x2=2 basically... then you add 2 more bricks (paired), becoming a dist-rep, 2x2=4
19:00 sjoeboo if you remove 1 and 3, you still have a 2x2=4 volume, but you only have the single replica, since you removed one from each pairing...
19:00 plarsen joined #gluster
19:02 bluefoxxx yes but apparently when you put bricks back in, you won't replicate anymore.
19:03 bluefoxxx there.  That find command seems to work, even though the heal command does absolutely nothing.
19:04 turbo124 I am trying to geo-replicate some Virtual Machines, however during the replication process the VMs' filesystems are getting put into read-only mode, i presume due to the file locking imposed by rsync. i haven't seen any literature saying this occurs... just needed confirmation that this is expected behaviour?
19:10 NeonLicht I see, sjoeboo, and what happens if I rebalance then?  Do the two remaining bricks get replicas of each other?
19:10 sjoeboo hm, i wouldn't think so
19:10 sjoeboo they are both replicas of other pairs
19:11 NeonLicht So, bricks are really paired up as they are in traditional raid?
19:11 sjoeboo i think you would actually delete the volume, scrub all the gluster bits, and recreate over the data, and then those remaining bricks would be paired
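If the aim is just to shrink a distributed-replicate volume without losing redundancy, the usual route is to remove a whole replica pair rather than one brick from each pair; a sketch with hypothetical names, where data is migrated off the pair before the final commit:

    gluster volume remove-brick myvol server3:/data server4:/data start
    gluster volume remove-brick myvol server3:/data server4:/data status
    gluster volume remove-brick myvol server3:/data server4:/data commit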
19:11 NeonLicht So, bricks are really paired up as *hard drives* are in traditional raid?
19:12 JuanBre joined #gluster
19:13 NeonLicht OK, I thought at first that the redundancy was done similarly to what btrfs does.  That's why I was so confused.
19:15 cyberbootje joined #gluster
19:17 bluefoxxx the hell just happen
19:23 * bluefoxxx tries removing the other brick after a heal.  Loses a few hundred megs of files.
19:26 tqrst JoeJulian: as I expected, replace-brick won't work since the peer doesn't exist yet, and 'peer probe mynewserver' fails with 'Probe returned with unknown errno 107'
19:28 dberry joined #gluster
19:28 dberry joined #gluster
19:29 tqrst the cli logs indicate "unable to find hostname: blah", where blah is a perfectly valid host name
19:29 tqrst same thing if I go by ip
19:29 Ryan_Lane joined #gluster
19:30 tqrst my bad, service glusterd start failed the first time around
19:30 Ryan_Lane does glusterd do something incredibly inefficient when it's doing volume start/stop/create/etc ?
19:30 bluefoxxx ok so lessons learned
19:30 bluefoxxx don't be writing to a glusterfs volume when one of your servers reboots.
19:31 Ryan_Lane start/stop/create and such are creating major problems in my cluster
19:32 Ryan_Lane the upstarts are also still broken for me
19:32 sjoeboo_ joined #gluster
19:33 Ryan_Lane is "expect fork" the actual behavior the glusterd process is doing? expect fork assumes the process will fork exactly once
19:34 theron joined #gluster
19:35 andrei_ joined #gluster
19:40 tomsve joined #gluster
19:43 jskinner_ joined #gluster
19:46 bauruine joined #gluster
19:56 semiosis Ryan_Lane: fixed a couple bugs in the upstart jobs for precise & quantal.  just uploaded latest fix this morning.
19:56 bluefoxxx how... what.
19:57 piotrektt_ joined #gluster
19:58 bluefoxxx How the heck is the volume 3.5G and the brick 3.3G
19:59 semiosis Ryan_Lane: most, if not all, commands done on the gluster cli require sync across the whole cluster, all glusterds in the pool
19:59 semiosis Ryan_Lane: well, the commands that change things anyway
19:59 Ryan_Lane yes, but the more volumes you add the slower the create/start/stop commands are
19:59 semiosis gluster volume info may just read local state
19:59 Ryan_Lane and they start eating incredible amounts of memory
20:01 Ryan_Lane and really, it seems that gluster just starts breaking down and dying once too many volumes exist
20:02 semiosis how many volumes are we talking about in this pool?
20:02 Ryan_Lane 300?
20:03 bluefoxxx how many disks do you have
20:03 Ryan_Lane 24 or so
20:03 bluefoxxx and you have 300 volumes
20:03 Ryan_Lane split into two raid 6 volumes that are LVM'd together
20:03 semiosis 300 volumes!!!
20:04 bluefoxxx ok what the shit
20:04 bluefoxxx is glustrefs compressing files?
20:04 Ryan_Lane I have 150 openstack projects. I have two volumes per project
20:04 jskinner_ joined #gluster
20:05 H___ joined #gluster
20:05 bluefoxxx [jrmoser@hq-ext-store-1 web]$ sudo find . -exec ls "/mnt/silo0/{}" \; > /dev/null
20:05 bluefoxxx this returns nothing.
20:05 semiosis bluefoxxx: glusterfs doesn't do compression (yet)
20:05 bluefoxxx /dev/disk/silo0      516057528   3421384 486421952   1% /mnt/silo0
20:05 bluefoxxx hq-ext-store-1:/web  516057472   3666048 486177280   1% /mnt/exports/web
20:06 bluefoxxx semiosis, how the heck is there less data on the actual physical brick than there is exported?
20:06 * Ryan_Lane sighs
20:06 Ryan_Lane I can't even run gluster volume status now
20:07 semiosis Ryan_Lane: that's a lot of volumes, i'd like to get the gluster devs' opinion on that.  i dont remember anyone coming in here with that many volumes
20:08 semiosis Ryan_Lane: i just gave a shout out about your issue in #gluster-dev, hopefully one of the devs is up & can weigh in on this
20:08 bluefoxxx for the life of me I can't get replication to stay consistent
20:09 bluefoxxx I broke it by rebooting one of the nodes while the cluster was being written to by rsync
20:09 semiosis Ryan_Lane: have you checked the glusterd log?  /var/log/glusterfs/etc-glusterfs-glusterd.log
20:09 semiosis Ryan_Lane: that may provide insight into whats going on
20:09 bluefoxxx since then I've dropped both bricks
20:10 Ryan_Lane it's not terribly insightful
20:10 Ryan_Lane I see a ton of "disconnecting now" log entries
20:11 Ryan_Lane with some of these sprinkled in: 0-management: connection attempt failed (Connection refused)
20:11 semiosis have you tried restarting glusterd on the servers?  that shouldn't affect client access
20:11 semiosis bbiab, lunch
20:11 Ryan_Lane upstart script is broken
20:12 Ryan_Lane stop hangs
20:12 Ryan_Lane if I run start it says the process is already running, even if I kill the process
20:13 * bluefoxxx tries this again and sees if self healing loses even more data next round.
20:17 bluefoxxx ... WHAT?
20:18 bluefoxxx node #2 is self-healing with 'gluster volume heal web full'
20:18 bluefoxxx node #1's brick is getting BIGGER.
20:18 gbrand_ joined #gluster
20:18 bluefoxxx in the end, node #2's brick is smaller than node #1's brick or the exported volume.
20:24 WildPikachu joined #gluster
20:28 Ryan_Lane egrep "\<(fork|clone)\>\(" /tmp/strace.log | wc | awk '{print $1}'
20:28 Ryan_Lane 17
20:28 Ryan_Lane glusterd forks 17 times?
20:28 Ryan_Lane I'm betting more than that
20:28 nocko joined #gluster
20:28 Ryan_Lane it died at some point
20:30 Ryan_Lane oh. wonderful. it wiped out some info files too
20:31 bluefoxxx https://bugzilla.redhat.com/show_bug.cgi?id=911361
20:31 glusterbot <http://goo.gl/oSfTQ> (at bugzilla.redhat.com)
20:31 glusterbot Bug 911361: high, unspecified, ---, pkarampu, NEW , Bricks grow when other bricks heal
20:31 bluefoxxx This is the crappiest bug report I've ever written.
20:34 bluefoxxx http://supercolony.gluster.org/pipermail/gluster-users/2012-July/033746.html
20:34 glusterbot <http://goo.gl/RqUBm> (at supercolony.gluster.org)
20:34 bluefoxxx same shit.
20:34 bluefoxxx Of course it's on the mailing list, where the pattern of behavior is "don't answer things that aren't user error"
20:35 plarsen joined #gluster
20:41 Ryan_Lane egrep "\<(fork|clone)\>\(" /tmp/strace.log | wc | awk '{print $1}'
20:41 Ryan_Lane 697
20:41 Ryan_Lane :D
20:41 Ryan_Lane I can definitely see why the upstart isn't working
20:42 semiosis Ryan_Lane: s/glusterd/glusterfs-server/ ?
20:42 semiosis i pushed that fix this morning
20:42 semiosis :(
20:42 Ryan_Lane no, this is glusterd
20:42 semiosis oh wait you're not mounting from localhost irrc
20:42 semiosis iirc
20:42 Ryan_Lane right
20:42 Ryan_Lane this is just glusterd
20:42 semiosis why isnt upstart working for you?
20:42 Ryan_Lane expect fork <— that's not going to work
20:42 semiosis is this precise or lucid?
20:42 Ryan_Lane precise
20:42 semiosis eh?
20:43 semiosis i thought that worked
20:43 Ryan_Lane expect fork assumes that glusterd will fork exactly once
20:43 semiosis hrmph
20:43 Ryan_Lane it forks way, way more than once
20:43 semiosis can you give me steps to reproduce?
20:43 semiosis the problem youre seeing
20:43 jiffe98 is there a graceful way to pull a replica out for maintenance?
20:43 Ryan_Lane semiosis: http://upstart.ubuntu.com/cookbook/#expect-fork
20:43 glusterbot Title: Upstart Intro, Cookbook and Best Practises (at upstart.ubuntu.com)
20:44 Ryan_Lane start the daemon with strace, check the number of times it forks in the strace output
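To illustrate the "expect fork" problem: upstart's "expect fork" only follows a single fork(), so a daemon that forks more than once gets tracked by the wrong pid and start/stop hang. One hypothetical workaround (not the packaged glusterfs-server job) is to drop the expect stanza and keep glusterd in the foreground with -N/--no-daemon, the same flag used in the upgrade command earlier in this log:

    # /etc/init/glusterfs-server.conf -- illustrative sketch only
    description "GlusterFS management daemon"
    start on (local-filesystems and net-device-up IFACE!=lo)
    stop on runlevel [016]
    respawn
    # no 'expect' stanza: glusterd stays in the foreground, so upstart tracks the real pid
    exec /usr/sbin/glusterd -N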
20:45 semiosis Ryan_Lane: i mean from an end user's pov, what steps do i follow, what should i expect, what do i see instead?
20:45 semiosis jiffe98: i killall gluster* processes
20:46 semiosis jiffe98: maybe also firewall that host off afterward
20:46 semiosis to prevent reconnect until i'm ready
20:46 badone joined #gluster
20:47 Ryan_Lane well, I expect for start glusterfs-server and stop glusterfs-server to work
20:47 Ryan_Lane instead they hang
20:47 Ryan_Lane because upstart isn't tracking the process properly
20:50 * semiosis boots the precise test vm
20:50 semiosis oh, it's still running \o/
20:53 cw joined #gluster
20:54 Ryan_Lane when running glusterd in debugging I see some bricks being marked as stopped and started over and over again
20:54 glusterbot New news from newglusterbugs: [Bug 911361] Bricks grow when other bricks heal <http://goo.gl/oSfTQ>
20:55 semiosis Ryan_Lane: i am unable to reproduce that problem :(
20:55 semiosis service glusterd {stop,start} work just fine for me
20:55 Ryan_Lane make a bunch of volumes
20:55 semiosis i have three
20:55 semiosis foo, bar, and baz :)
20:55 Ryan_Lane hm
20:56 semiosis my tests
20:56 Ryan_Lane I do have one difference
20:56 Ryan_Lane limit nofile 40960 40960
20:56 semiosis idk how that would matter, but sure lets try
20:56 Ryan_Lane I don't see why it would either
20:57 NeonLicht What's the difference between 'replicated' and 'distributed replicated' volumes, please?  I can't see it from reading the Admin Guide.
20:58 semiosis NeonLicht: maybe ,,(semiosis tutorial) would be helpful
20:58 glusterbot NeonLicht: http://goo.gl/6lcEX
20:58 semiosis or possibly ,,(joe's blog)
20:58 glusterbot http://goo.gl/EH4x
20:58 semiosis distributed replicated is basically making a distributed volume out of more than one replicated "subvolumes"
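For example, a replica-2 volume created with four bricks becomes a 2 x 2 distributed-replicate: files are spread across the two replica pairs and each pair mirrors its own files (hostnames and brick paths here are hypothetical; brick order determines the pairing):

    gluster volume create myvol replica 2 \
        server1:/bricks/a server2:/bricks/a \
        server3:/bricks/b server4:/bricks/b
    gluster volume info myvol     # Type: Distributed-Replicate, Number of Bricks: 2 x 2 = 4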
20:59 NeonLicht Thank you, semiosis.
20:59 semiosis yw
20:59 NeonLicht Reading the blog and the tutorial now... :)
21:00 bluefoxxx semiosis, how does healing cause the donor to expand?
21:00 semiosis bluefoxxx: no idea
21:00 bluefoxxx is that normal?
21:00 semiosis maybe xattrs?
21:01 semiosis bluefoxxx: never seen that, but then again i may not have tried what you're doing
21:01 randomcamel joined #gluster
21:01 bluefoxxx I dropped a brick out, reformatted it, re-introduced it, and then ran 'gluster volume heal web full'
21:01 semiosis when you say 'donor' you mean you created a new replicate volume with one preloaded brick and one empty brick?
21:01 bluefoxxx with nothing else accessing the volume
21:01 semiosis hm interesting
21:01 bluefoxxx so the new empty brick fills up to what the size of the gluster volume WAS
21:01 bluefoxxx meanwhile, the other server's brick gets 0.25GB bigger
21:02 bluefoxxx and the volume itself grows to be the same space usage as the more full brick
21:02 bluefoxxx this was all triggered by me loading up the volume via rsync and rebooting one of the servers, and having the rebooted server corrupt itself so badly that it couldn't self-heal :|
21:03 randomcamel hey all. I just tried running through the quickstart at http://www.gluster.org/community/documentation/index.php/QuickStart and ran into a problem. when I do `mount -t glusterfs gluster1.ots.ooyala.com /gluster` on a brick as the tutorial suggests, I get "[2013-02-14 20:59:55.522212] C [glusterfsd.c:1220:parse_cmdline] 0-glusterfs: ERROR: parsing the volfile failed (No such file or directory)".
21:03 glusterbot <http://goo.gl/OEzZn> (at www.gluster.org)
21:03 jskinner_ joined #gluster
21:03 bluefoxxx I was fairly certain somewhere I lost a few hundred megabytes of data too, but now I'm not so sure
21:04 * bluefoxxx is frustrated and in stage 4 of burn-out, should probably not be interacting with people
21:04 semiosis Ryan_Lane: increased nofile, but still works
21:05 randomcamel hmmm. I bet this is my fault for trying to run with reiserfs brick filesystems.
21:06 semiosis i'll be in & out over the next couple hours, feel free to address me though & i'll do my best to respond when I can
21:07 NeonLicht semiosis, your tutorial is great, I'm reading the whole thing.  Since I wasn't there when you gave it, would you mind my asking some questions now?  :-)
21:07 semiosis randomcamel: your mount command is not well formed, see ,,(mount server)
21:07 glusterbot randomcamel: (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds, or (#2) Learn more about the role played by the server specified on the mount command here: http://goo.gl/0EB1u
21:07 semiosis randomcamel: missing the volume name
21:08 semiosis randomcamel: also seems you may be confusing terms, so ,,(glossary)
21:08 glusterbot randomcamel: A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
21:08 randomcamel er, sorry. it's `mount -t glusterfs gluster1.ots.ooyala.com:/gv0 /gluster`
21:08 randomcamel mispasted
21:08 semiosis NeonLicht: yes feel free to ask, like i said i'll be in & out but i'll try to respond when i can, or someone else may have an answer before i do
21:08 semiosis NeonLicht: and thx for the positive feedback :)
21:09 randomcamel "Initialization of volume 'fuse' failed, review your volfile again"
21:09 Ryan_Lane yeah, the upstart job is definitely tracking the incorrect pid
21:09 Ryan_Lane not sure how/why
21:09 randomcamel "0-glusterfs-fuse: cannot open /dev/fuse (No such file or directory)"
21:09 semiosis randomcamel: that shouldn't happen when you use the gluster cli to create & manage volumes... you shouldn't edit them by hand
21:09 semiosis randomcamel: oh hm, modprobe fuse?
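The "cannot open /dev/fuse" error usually just means the fuse kernel module isn't loaded on the client; a minimal sketch of the fix, reusing the hostname and volume from the quickstart attempt above:

    modprobe fuse
    mount -t glusterfs gluster1.ots.ooyala.com:/gv0 /gluster
    echo fuse >> /etc/modules      # assumption: Debian/Ubuntu-style persistence, adjust per distro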
21:09 JuanBre joined #gluster
21:09 semiosis gtg, bbiab
21:10 NeonLicht semiosis, I'm wondering about CloudInit + Puppet and how it would compare with OpenNebula and OneSIS, which I've been planning to use together with GlusterFS on btrfs.  Do you have experience/an opinion on them?
21:10 randomcamel VICTORY IS MINE
21:10 randomcamel semiosis: awesome, thank you. =)
21:10 semiosis NeonLicht: never heard of those, sorry
21:11 semiosis randomcamel: awesome, yw :)
21:11 semiosis ok for real now
21:11 * semiosis &
21:12 NeonLicht Oh, I see, semiosis, OneSIS deploys servers using a single directory tree, and OpenNebula manages clusters locally, on the cloud (EC2 et al.), or mixed.  I guess it's somehow similar to your setup.
21:22 tqrst JoeJulian: welp, replace-brick doesn't work without the new server being probed and online, and if I do probe it and keep it online, then I get "/mnt/donottouch/localb or a prefix of it is already part of a volume" when running replace-brick (if it makes any difference, brick paths are standardized across servers so that all servers have /mnt/donottouch/local{b,c,d})
21:22 glusterbot tqrst: To clear that error, follow the instructions at http://goo.gl/YUzrh or see this bug http://goo.gl/YZi8Y
21:23 tqrst is replacing a dead server with another one supposed to be so complicated?
21:23 tqrst why can't I just pop the disks out of the dead server, pop them into the new one, and do "gluster volume myvol oh hey deadnode is now alivenode, figure it out"
21:25 johnmark Ryan_Lane: how many servers?
21:25 Ryan_Lane johnmark: 4
21:29 chouchins joined #gluster
21:30 chouchins joined #gluster
21:31 johnmark Ryan_Lane: so you have 300 volumes on 4 servers, and you're getting poor performance
21:31 Ryan_Lane it's not the performance I'm complaining about
21:31 johnmark do you have that many volumes for multi-tenancy
21:31 Ryan_Lane yes
21:31 johnmark Ryan_Lane: oh, it's the control path
21:31 johnmark ok, got it
21:31 Ryan_Lane well, it's a bunch of things
21:31 johnmark o
21:32 johnmark k
21:32 Ryan_Lane one problem I'm fighting is the upstart
21:32 Ryan_Lane this, however, is a major problem: https://bugzilla.redhat.com/show_bug.cgi?id=907540
21:32 glusterbot <http://goo.gl/jNFB3> (at bugzilla.redhat.com)
21:32 glusterbot Bug 907540: unspecified, unspecified, ---, vbellur, NEW , Gluster fails to start many volumes
21:33 johnmark ok
21:33 Ryan_Lane also, gluster volume start/stop/create take forever, eat tons of memory and cpu, and cause glusterd to become completely unresponsive for 20-30 seconds
21:33 johnmark Ryan_Lane: did you report that bug?
21:34 johnmark that's... interesting
21:34 Ryan_Lane I've had 3 outages in the past two weeks
21:34 johnmark ok
21:34 JuanBre joined #gluster
21:34 johnmark I want to make sure we get all the right information for our dev summit in March
21:35 johnmark because as we make the KVM integration mo' better for the data path
21:35 johnmark I'm guessing that the control path will lag
21:35 Ryan_Lane well, I'm not using kvm
21:35 johnmark I know - but it will add to this very problem
21:35 Ryan_Lane I have two volumes per project that are being shared directly to the instances
21:35 * Ryan_Lane nods
21:35 johnmark ok
21:36 johnmark Ryan_Lane: did you put your info to that bug?
21:36 johnmark er in
21:36 johnmark gah
21:36 Ryan_Lane which info?
21:36 johnmark Ryan_Lane: hang on. let me actually view the ug
21:36 johnmark bug
21:38 johnmark Ryan_Lane: oh, you're the bug reporter. good :)
21:38 johnmark I just want to make sure all the info you're giving here is captured
21:39 andreask joined #gluster
21:43 johnmark Ryan_Lane: so not that it helps you in this particular case, but glusterd is going multi-threaded in 3.4
21:49 fabio|gone joined #gluster
21:54 glusterbot New news from newglusterbugs: [Bug 907540] Gluster fails to start many volumes <http://goo.gl/jNFB3>
22:00 _br_ joined #gluster
22:07 _br_ joined #gluster
22:08 xian1 I removed files and parent directories from bricks and .glusterfs (v 3.3.1) seemingly successfully, but there's something preventing me from recreating the directory via the fuse mount point.  If I want to recreate the directory named "9", it says "mkdir: cannot create directory `9': File exists"; it is not visible via any brick or client mount point.  If I try to remove the directory, I get "[fuse-bridge.c:1029:fuse_unlink_cbk] 0-glusterfs-fuse: 29317: RMD
22:27 xian1 ok I fixed it.  repeated umount of the fuse file system on the server w/ the wedged rsync; then remounted, and the rsync process released.  after that, I was able to mkdir 9 and the dir and gfid are identical on all nodes.  note that I was using joejulian's blog info from 'fixing split-brain with glusterfs 3.3'
22:38 DeltaF joined #gluster
22:42 raven-np joined #gluster
22:46 raven-np1 joined #gluster
22:49 tryggvil joined #gluster
23:00 aliguori joined #gluster
23:03 mooperd joined #gluster
23:05 VeggieMeat joined #gluster
23:05 turbo124 joined #gluster
23:06 JordanHackworth joined #gluster
23:07 shireesh joined #gluster
23:07 arusso joined #gluster
23:07 nocko joined #gluster
23:08 daMaestro joined #gluster
23:09 TomS joined #gluster
23:09 hattenator joined #gluster
23:10 thekev joined #gluster
23:37 DeltaF Should I really be getting 220K/sec copy speed onto a glusterfs? Seems terribly slow
23:50 _br_ joined #gluster
23:56 _br_ joined #gluster
