
IRC log for #gluster, 2012-12-03


All times shown according to UTC.

Time Nick Message
00:11 kevein joined #gluster
00:54 yinyin joined #gluster
01:21 cyberbootje joined #gluster
02:01 manik1 joined #gluster
02:04 manik joined #gluster
02:13 manik joined #gluster
02:18 samkottler joined #gluster
02:18 samkottler joined #gluster
02:19 samkottler joined #gluster
02:21 samkottler joined #gluster
02:25 sunus joined #gluster
02:28 manik joined #gluster
02:31 designbybeck joined #gluster
02:37 manik joined #gluster
02:37 sunus hi, is anyone using mutt? how can i "pageup" reverse to 'enter' in a message?
02:37 bala joined #gluster
02:38 m0zes sunus: wouldn't that be a question better asked in #mutt ?
02:40 sunus m0zes: lol, i am reading gluster-user and devel..so.. anyway, i found this:)
02:40 sunus morse: sorry:)
02:58 bharata joined #gluster
03:09 syoyo__ joined #gluster
03:39 nick5 joined #gluster
03:40 nick5 joined #gluster
03:51 manik joined #gluster
04:09 vpshastry joined #gluster
04:19 yinyin joined #gluster
04:41 sripathi joined #gluster
04:54 hagarth joined #gluster
05:03 nightwalk joined #gluster
05:09 Humble joined #gluster
05:22 bulde joined #gluster
05:23 lng joined #gluster
05:23 yinyin_ joined #gluster
05:24 lng Hi! I have some duplicated directories on replicated volume. What does it mean?
05:25 lng And how to fix it?
05:32 vimal joined #gluster
05:45 sripathi joined #gluster
05:47 Humble joined #gluster
05:50 ramkrsna joined #gluster
05:53 bharata joined #gluster
06:07 glusterbot New news from newglusterbugs: [Bug 882780] make fails with error 'cli-xml-output.c:3173:48: error: unknown type name ‘xmlTextWriterPtr’ make[2]: *** [cli-xml-output.o] Error 1 make[1]: *** [install-recursive] Error 1" <http://goo.gl/Z1cDh>
06:09 Humble joined #gluster
06:11 raghu joined #gluster
06:13 lng I think it is caused by split-brain files
06:22 bala joined #gluster
06:24 yinyin joined #gluster
06:25 ankit9 joined #gluster
06:33 vijaykumar joined #gluster
06:34 mohankumar joined #gluster
06:49 syoyo__ joined #gluster
06:49 guigui3 joined #gluster
06:50 nick5 joined #gluster
06:52 bharata joined #gluster
07:08 quillo joined #gluster
07:10 inodb_ joined #gluster
07:12 yinyin joined #gluster
07:19 GLHMarmot joined #gluster
07:22 sripathi joined #gluster
07:29 ngoswami joined #gluster
07:38 ngoswami joined #gluster
07:45 ekuric joined #gluster
07:48 inodb joined #gluster
07:49 ngoswami joined #gluster
07:56 rudimeyer_ joined #gluster
08:04 ctria joined #gluster
08:06 shireesh joined #gluster
08:24 lkoranda joined #gluster
08:27 andreask joined #gluster
08:28 dobber joined #gluster
08:30 Jippi joined #gluster
08:43 mdarade1 joined #gluster
08:54 webwurst joined #gluster
09:02 bulde1 joined #gluster
09:02 DaveS_ joined #gluster
09:04 tjikkun_work joined #gluster
09:21 gbrand_ joined #gluster
09:25 lng JoeJulian: Hello! In case of heal-failed you suggested to delete links in .glusterfs, but how about split-brain? Should I delete these gfids too?
09:29 sripathi joined #gluster
09:31 stre10k joined #gluster
09:31 stre10k hi
09:31 glusterbot stre10k: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
09:34 mooperd joined #gluster
09:37 stre10k "gluster volume top <VOLNAME> open nfs" on gluster server v3.3.1 doesn't show any information; the volume is mounted via nfs on another server
09:41 yinyin joined #gluster
09:41 stre10k how to make it to work?
09:43 H__ auw ! geo-replicate [master:170:crawl] GMaster: ... done, took 152041.222402 seconds. That's 42 hours. And then it dies and starts all over.
09:49 sripathi joined #gluster
09:56 Staples84 joined #gluster
09:56 Azrael808 joined #gluster
09:59 toruonu joined #gluster
10:00 duerF joined #gluster
10:10 harshpb joined #gluster
10:11 lanning joined #gluster
10:19 guigui1 joined #gluster
10:22 sripathi joined #gluster
10:25 stre10k is this performance data normal? http://pastebin.com/eW4NA3BR
10:25 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
10:26 stre10k copy on http://dpaste.org/7BgUJ/
10:26 glusterbot Title: dpaste.de: Snippet #214276 (at dpaste.org)
10:30 bauruine joined #gluster
10:31 stre10k LOOKUP Avg-latency 9295.63 us
10:36 mdarade1 left #gluster
10:38 glusterbot New news from newglusterbugs: [Bug 882127] The python binary should be able to be overridden in gsyncd <http://goo.gl/cnTha>
10:44 ankit9 joined #gluster
10:46 bulde joined #gluster
10:48 sripathi joined #gluster
10:50 toruonu I have a weird state. I'm getting from an application that the local file SQLite database is locked and that for all tasks. Running an strace I get this:
10:50 toruonu 836   fcntl(4, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=0, len=1}) = -1 EACCES (Permission denied)
10:51 toruonu the way we use gluster is for /home using an NFS mount. The location where the NFS was taken had an issue this morning where it didn't respond to any outside stimuli (ssh to it failed etc). So I umounted /home using lazy umount and mounted from another server
10:51 toruonu after that this error started
10:51 toruonu I can browse around the filesystem and create new files, but oddly enough I can't use this tool that wants to set a lock
10:52 Hymie you may need to restart NFS on the NFS server... NFS clients can set a lock, and I've seen cases where the client crashes, and the server holds the lock -- that's rare, but it happens
10:52 Hymie I'd think this before gluster, since NFS likes to be evil
10:52 Hymie /etc/init.d/<various NFS stuff> restart
10:52 Hymie or reboot the server
10:52 Hymie did you try that?
10:54 toruonu well I've got users on it :) loads of them :)
10:54 toruonu right now I sent a wall message asking everyone to log out so that I can restart the VM
10:54 Hymie VM means nothing to me.  Do you mean your NFS server? (I don't know if your VM is the client or server or both)
10:54 toruonu the nfs mount comes from glusterd so I can't restart the NFS there :)
10:55 toruonu I could restart glusterd on all of them though … or which part takes care of NFS export?
10:55 toruonu it's not the system NFS export afaik … and the original NFS server has been rebooted, but the client has already been remounted from another NFS server (another server serving bricks)
10:55 Hymie I'm not a big gluster guy, but I'd be surprised if NFS has anything to do with glusterfs, other than it's running on a server somewhere
10:56 Hymie hmm
10:56 toruonu no, as far as I know gluster exports its own NFS
10:56 toruonu and you CAN'T have the system wide NFS server running on those nodes
10:56 toruonu the rpc stuff though is probably system wide
10:56 Hymie that seems very strange, why reinvent the code base, we have enough troubles with NFS without having a fork in the code
10:57 toruonu I don't know the exact implementation so it may use the nfs libraries from the OS or smth …
10:57 Hymie ok, well, I have to get to work... my knowledge ends in that I know I've had weird NFS locking issues, that were only resolved with the NFS server itself being restarted (due to an NFS client bugging/crashing)
10:57 Hymie toruonu: most likely you just have an NFS server running somewhere
10:57 Hymie but again
10:57 Hymie have to go :(
10:57 Hymie good luck though
10:57 toruonu all of the nodes with bricks have it running :)
10:58 toruonu you can mount it from anywhere
10:58 toruonu and somehow it's consistent
10:58 toruonu don't ask me how
10:58 ndevos Hymie: it is because fuse does not support being a backend for the kernel nfsd server
10:58 Hymie ndevos: interesting
10:58 toruonu ndevos: got any ideas how I can get out of this mess?
10:58 ndevos Hymie: it's documented in the README.nfs from the fuse package ;)
10:59 Hymie ndevos: have to read that sometime then
10:59 toruonu in theory I swapped everything from one to another … the client is mounted from another NFS server and the original server restarted
11:00 Hymie if they're all interlinked though, then they need to share locking info
11:00 inodb_ joined #gluster
11:00 toruonu that is indeed logical… then the question becomes how do you reset the locking info...
11:01 * Hymie .. on single server usage as he's had in the past, had this problem once every 1 1/2 years or so
11:01 ndevos toruonu: you may get EACCESS errors when the file has a split-brain and can not be (or is in the progress of) healed
11:01 ndevos ]/win 9
11:01 Hymie and, always took that moment to do a kernelupgrade anyhow... so, never looked around it
11:01 toruonu http://fpaste.org/g1oq/
11:01 glusterbot Title: Viewing Gathering Heal info on volume home0 ... Brick 192.168.1.240:/d35 Number of ... .168.1.240:/d36 Number of entries: 0 (at fpaste.org)
11:02 toruonu info split-brain shows no files there either
11:02 toruonu and it's affecting ALL crab tasks of which I have many … all of them attempt a lock on the task database (each has its own) and fails
11:02 toruonu so it's the mount issue
11:02 stre10k how to tune LOOKUP latency?
11:03 toruonu stre10k: I just recently battled with that … what are you actually trying to tune away? :)
11:04 guigui3 joined #gluster
11:04 toruonu I saw our volumes spend 99% of the time in lookup in latency measurements and the major reason was that we got a lot of misses for files that don't exist
11:04 toruonu ENOENT etc
11:05 toruonu we moved then from fuse mount to NFS mount to get the caching of negative lookups
11:05 toruonu and in general directory caching etc that sped up everything substantially
11:08 stre10k toruonu: i want to reduce the latency of the gluster service, the 9 sec lookup I think is too high
11:08 toruonu how is it mounted right now? fuse?
11:09 stre10k toruonu: nfs
11:09 toruonu hmm… interesting… what do you do that takes 9s? because I can get a relatively quick answer to ls --color even in a folder with 3500 files in it
11:10 toruonu and --color does a stat() syscall on every file that in fuse case caused a self-heal check and took in total hundreds of seconds
11:10 ndevos toruonu: not sure what would cause that locking issue though... could it be that an other process still has a lock on that file? (and a lazy umount does not clear anything kept open/in-use)
11:11 stre10k toruonu: there may be many negative lookups now
11:11 toruonu ndevos: to attempt that as a fix I just restarted the VM (the only possible location what could have kept the old mount running)
11:12 toruonu stre10k: well the negative lookups took 99% of the time in my profiling and I got rid of them with the NFS mount
11:12 toruonu at least the slowdown is not noticable
11:12 toruonu a task that on local disk took ca 2+ minutes takes now on gluster also 2+ minutes instead of 30 minutes
11:13 toruonu ndevos: seems to have indeed been the case of stale nfs handles in the lazy umount … right now at least crab is checking status of jobs and previously it crashed here within seconds, now it's taking its time so either gluster is waaay slow or it's actually working
11:13 inodb joined #gluster
11:25 toruonu *sigh*
11:25 toruonu seems it didn't fix it… but instead indeed gluster was extremely slow
11:26 toruonu probably the first command creates the NFS negative lookup cache ...
11:26 stre10k toruonu: I switch the nginx load to nfs mount and take this stats http://dpaste.org/KrzRC/
11:26 glusterbot Title: dpaste.de: Snippet #214278 (at dpaste.org)
11:27 toruonu 55ms / call isn't that bad I think … though maybe experts can say how to tune that even further down…. if it's DHT misses that are impacting you, then you can try turning off unhashed lookup as JoeJulian recommended for me…
11:28 toruonu http://joejulian.name/blog/dht-misses-are-expensive/
11:28 glusterbot <http://goo.gl/A3mCk> (at joejulian.name)
11:28 toruonu cluster.lookup-unhashed: off
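For reference, a minimal sketch of how that option is set and verified (the volume name "home0" is taken from the paste above; substitute your own):
    gluster volume set home0 cluster.lookup-unhashed off   # skip the broadcast lookup on DHT hash misses
    gluster volume info home0                              # should now list it under Options Reconfigured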
11:33 mohankumar joined #gluster
11:38 toruonu ok guys … this is kind of a showstopper right now
11:38 toruonu the whole locking stuff is making the whole thing not work
11:39 toruonu and now that the VM has been restarted there's no way a stale process could be using the lock
11:43 gbrand__ joined #gluster
12:00 guigui3 joined #gluster
12:07 ankit9 joined #gluster
12:12 inodb_ joined #gluster
12:14 inodb^ joined #gluster
12:14 mario_ joined #gluster
12:16 toruonu weird network hiccup… so … I'm still stuck with nfs locking problems
12:16 toruonu how can I trace across all possible mounts who keeps the lock?
12:16 toruonu shouldn't gluster know that? how else it claims there is no permission?
12:18 vimal joined #gluster
12:23 tryggvil joined #gluster
12:23 tryggvil_ joined #gluster
12:30 __Bryan__ left #gluster
12:35 toruonu hmm… I'm getting on the HN this:
12:35 toruonu [1038508.865280] lockd: cannot monitor 192.168.1.244
12:37 toruonu baaaah
12:37 toruonu service nfslock restart fixed it all
12:37 * toruonu goes behind a corner and shoots himself
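For anyone hitting the same EACCES-on-fcntl symptom, a rough sketch of the recovery sequence described above, assuming an EL-style client where /home is a gluster NFS mount and 192.168.1.240 is one of the brick servers (service names and mount options vary by distro):
    umount -l /home                                     # drop the stale mount left behind by the dead server
    service nfslock restart                             # reset the rpc.statd/lockd state that was holding the lock
    mount -t nfs -o vers=3 192.168.1.240:/home0 /home   # remount from a healthy gluster NFS server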
12:45 inodb joined #gluster
12:51 ankit9 joined #gluster
12:56 hagarth joined #gluster
13:00 balunasj joined #gluster
13:07 vijaykumar left #gluster
13:08 plarsen joined #gluster
13:08 designbybeck joined #gluster
13:11 __Bryan__ joined #gluster
13:16 H__ Anyone here running FineFS ( http://code.google.com/p/finefs/ ) as (php web oriented) alternative to geo-replication ?
13:16 glusterbot Title: finefs - Replicated network filesystem, easy to deploy on web clusters - Google Project Hosting (at code.google.com)
13:16 gbr joined #gluster
13:19 lanning joined #gluster
13:34 Norky joined #gluster
13:37 webwurst H__: FineFS looks outdated?
13:39 H__ webwurst: certainly not in wide use, the bin/ was updated 4 months ago however (on github)
13:45 webwurst H__: i don't know finefs but since 2009 not much seems to have happened. on github there is one commit with nine changed lines.
13:47 plarsen joined #gluster
13:59 harshpb joined #gluster
14:00 jdarcy ChironFS has gone without changes for even longer.  :(
14:02 aliguori joined #gluster
14:02 tryggvil__ joined #gluster
14:03 tryggvil___ joined #gluster
14:04 robo joined #gluster
14:06 chirino joined #gluster
14:08 harshpb joined #gluster
14:09 stre10k left #gluster
14:09 gbrand_ joined #gluster
14:14 Norky joined #gluster
14:17 harshpb joined #gluster
14:21 harshpb joined #gluster
14:32 olisch joined #gluster
14:32 harshpb joined #gluster
14:33 harshpb joined #gluster
14:35 mohankumar joined #gluster
14:37 rwheeler joined #gluster
14:37 tryggvil_ joined #gluster
14:37 tryggvil joined #gluster
14:40 harshpb joined #gluster
14:41 olisch hey guys, i have a question concerning gluster 3.2.6. i have a glusterfs setup with some distribute and distributed-replicate volumes distributed over 14 bricks. the bricks are added to the volume using their ip address instead of a hostname. because of some major infrastructure changes i will have to renumber the ip addresses of these 14 bricks. is that possible by just destroying and recreating the volumes with the new ip addresses (without changi
14:42 Norky joined #gluster
14:42 harshpb joined #gluster
14:45 harshpb joined #gluster
14:46 nightwalk joined #gluster
14:46 stopbit joined #gluster
14:47 harshpb joined #gluster
14:48 harshpb joined #gluster
14:55 harshpb joined #gluster
14:57 Norky joined #gluster
15:08 JoeJulian olisch: It's possible, but it's probably easier just to recreate them. You /can/ change them in the /var/lib/glusterfs/vols directories if everything is stopped.
15:11 Norky joined #gluster
15:12 Norky joined #gluster
15:12 Norky joined #gluster
15:16 Norky joined #gluster
15:21 olisch joejulian: thx, i will try that with a test setup
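A hedged sketch of the in-place edit being suggested; the state directory differs by release (3.2.x keeps it under /etc/glusterd, 3.3.x under /var/lib/glusterd) and the addresses below are placeholders, so rehearse this on a test copy first:
    service glusterd stop; pkill glusterfsd              # everything must be stopped on every node
    grep -rl '10.0.0.11' /etc/glusterd | xargs sed -i 's/10\.0\.0\.11/10.1.0.11/g'   # old IP -> new IP
    service glusterd start                               # repeat per brick host, then verify with: gluster volume info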
15:24 3JTAAAZM0 joined #gluster
15:26 UnixDev joined #gluster
15:26 UnixDev left #gluster
15:26 UnixDev joined #gluster
15:28 gbr joined #gluster
15:29 gbr What does the following mean in the self heal log:
15:29 gbr [2012-12-03 09:29:49.713135] W [client3_1-fops.c:1059:client3_1_getxattr_cbk] 0-NFS_RAID6_FO-client-1: remote operation failed: No such file or directory. Path: <gfid:1f07b87c-1833-48ae-a12c-e8483ac08a75> (00000000-0000-0000-0000-000000000000). Key: glusterfs.gfid2path
15:31 JoeJulian getxattr_cbk is the callback for (obviously) getxattr. That sais that on volume NFS_RAID6_FO client 1 (numbered from 0) the remote failed a getxattr call because there was no such file or directory for .glusterfs/1f/07/1f07b87c-1833-48ae-a12c-e8483ac08a75
15:32 JoeJulian I should really proofread before I hit enter.... sais? really?
15:32 * JoeJulian needs coffee
15:32 gbr Interesting, the other replicate node is getting 'inode link failed on the inode'
15:32 gbr Coffee is always good.
15:34 gbr Node 1 went down over the weekend, and this is what I get this morning.   I was serving NFS (glusterNFS) over node 0, and all the NFS clients lost connection at the same time Node 1 went down.
15:34 gbr Not a good morning.
15:35 JoeJulian Are you using ucarp or something to float an ip address for your nfs clients to connect to?
15:36 gbr yup.  ucarp.  the IP should never have moved though.
15:36 JoeJulian Oh, right.. just rearead that.
15:36 JoeJulian GAh! I can't type this morning.
15:38 gbr I'm thinking of stopping gluster on node 1, reformatting the base share (currently NFS), restarting gluster and having a full re-sync.
15:39 gbr sorry, currently XFS.
15:39 gbr I may try ext4.
15:39 JoeJulian #ext4
15:39 JoeJulian @ext4
15:39 glusterbot JoeJulian: Read about the ext4 problem at http://goo.gl/PEBQU
15:40 JoeJulian And there hasn't been any updates on that bug on that bug report or on http://review.gluster.org in months so I think ext4 has been abandoned.
15:40 glusterbot Title: Gerrit Code Review (at review.gluster.org)
15:41 gbr I'll stick with XFS.  Does gluster like any special formatting or mount options?
15:42 JoeJulian -i size=512 is good. 1024 if you plan on using posix acls.
15:43 gbr no ACL's. I'm hosting a bunch of Virtual Machine images on Gluster
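A minimal example of the brick format being recommended (device and mount point are placeholders; 512-byte inodes leave room for gluster's extended attributes):
    mkfs.xfs -i size=512 /dev/sdb1                            # /dev/sdb1 is a placeholder device
    mkdir -p /export/brick1 && mount -o noatime,inode64 /dev/sdb1 /export/brick1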
15:44 toruonu I use ext4
15:45 toruonu works flawlessly …
15:45 JoeJulian yikes
15:45 toruonu except we don't use the affected kernel
15:45 toruonu :D
15:45 JoeJulian yet
15:45 toruonu nah … don't see any reason to change OS or what not on those servers
15:45 toruonu they are pure storage servers running gluster and hadoop
15:45 gbr I'm running a stock Ubuntu 12.04.1 kernel, so I'll stay away from ext4
15:45 toruonu doing nothing else :)
15:45 toruonu and anything CERN related is still stuck with Scientific Linux 5.x
15:47 JoeJulian I'm just surprised that bug didn't end up as a critical one. Reformatting a system doesn't seem like a reasonable workaround and postponing kernel security patches because of this also seems unreasonable.
15:50 gbrand_ joined #gluster
15:50 tqrst is the hardlink count for folders on a brick supposed to be consistent across bricks? I have 25 hardlinks pointing to / on 32 bricks, but 26 on the remaining 8.
15:51 gbr I just stopped my gluster share on node 1, and node 0 stopped serving NFS long enough for my VM's to lose their HDD's
15:51 gbr running 3.3.0 on Node 0 and 3.3.1 on node 1
15:52 H__ Anyone have a solution to this ? -> geo-replicate [master:170:crawl] GMaster: ... done, took 152041.222402 seconds. That's 42 hours. And then it dies and starts all over.
15:52 JoeJulian gbr: how are you stopping it?
15:52 gbr or maybe when I restarted the node after formatting the drive:[2012-12-03 09:50:04.867016] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-NFS_RAID6_FO-replicate-0: background  meta-data data entry self-heal failed on <gfid:14bb0339-9d70-4a17-97d6-b8429742bd2f>
15:53 gbr I did a gluster volume stop NFS_RAID6_FO
15:53 JoeJulian tqrst: You can't hardlink directories so that seems confusing.
15:53 tqrst JoeJulian: it appears that I also need coffee
15:54 tqrst JoeJulian: second entry of ls -alFd must be something else then
15:54 * tqrst scratches his head
15:55 JoeJulian H__: Not sure if 42 hours is reasonable for your dataset. From what you're saying, though, the "dies" part seems to be the problem there. Probably would be good to look more at that part (logs, coredumps, etc.)
15:55 tqrst oh - number of folders/links inside the folder
15:55 tc00per joined #gluster
15:56 toruonu And I'm NOT going back to XFS ever again
15:56 toruonu I lost 300TB due to XFS until we moved over to EXT4
15:56 JoeJulian toruonu: How long ago was that?
15:56 toruonu you or someone already asked :) ca 1 month
15:57 toruonu and THAT was on recent kernels
15:57 * JoeJulian looks surprised.
15:57 toruonu we had unstable power distribution grid after a major rebuild of the DC
15:57 toruonu so we at random lost power to chunks of 20 servers at a time
15:57 toruonu depending on load
15:57 toruonu it took us a month and ca 10 random power losses to balance the grid
15:57 JoeJulian Interested in learning more about that, but I've got to run. Did you write anything up somewhere to read about how xfs caused that?
15:58 lh joined #gluster
15:58 lh joined #gluster
15:58 toruonu in that time the hadoop volumes that were on XFS AND with double replication lost files all the time
15:58 toruonu every crash was ca 1M blocks lost
15:58 toruonu that's enough to have both replicas killed for a number of files
15:58 toruonu I didn't blog yet at the time :D
15:58 toruonu I have been contemplating writing it up again
15:59 tqrst Is the ".landfill" folder gluster-related? A few of my bricks have that in the root. Google isn't very useful on this.
16:00 gbr looking at the self heal logs, it looks like gluster 3.3.0 did a restart when a replicate was stopped or started:
16:00 toruonu basically the issue is something that other CERN Tier 2 centers have seen as well
16:00 toruonu the file ends up with 0 size after power loss
16:00 gbr [2012-12-03 09:49:57.015487] E [client-handshake.c:1717:client_query_portmap_cbk] 0-NFS_RAID6_FO-client-1: failed to get the port number for remote subvolume
16:00 gbr [2012-12-03 09:49:57.015551] I [client.c:2090:client_rpc_notify] 0-NFS_RAID6_FO-client-1: disconnected
16:00 gbr [2012-12-03 09:49:57.015574] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-NFS_RAID1_FO-client-1: changing port to 24009 (from 0)
16:00 toruonu it's a FS table corruption somehow
16:00 toruonu because of the short writes happening at the time of power loss
16:00 gbr [2012-12-03 09:49:57.015608] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-NFS_RAID6_FO-client-0: changing port to 24012 (from 0)
16:00 gbr [2012-12-03 09:49:57.015632] I [rpc-clnt.c:1660:rpc_clnt_reconfig] 0-NFS_RAID1_FO-client-0: changing port to 24011 (from 0)
16:00 gbr [2012-12-03 09:49:59.946218] W [socket.c:410:__socket_keepalive] 0-socket: failed to set keep idle on socket 8
16:00 tqrst ah: #define TRASH_DIR "landfill"
16:00 gbr [2012-12-03 09:49:59.946277] W [socket.c:1876:socket_server_event_handler] 0-socket.glusterfsd: Failed to set keep-alive: Operation not supported
16:00 gbr [2012-12-03 09:50:00.838737] I [client-handshake.c:1636:select_server_supported_programs] 0-NFS_RAID6_FO-client-0: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
16:01 glusterbot use fpaste.org or dpaste.org instead of flooding the channel please.
16:03 andreask joined #gluster
16:04 zaitcev joined #gluster
16:04 balunasj joined #gluster
16:06 rwheeler joined #gluster
16:08 tqrst if I'm changing the mount point of a brick, is the following sufficient? gluster remove-brick $server:$brickmountpoint; umount $brickmountpoint; mount /dev/foobar $newmountpoint; gluster add-brick $server:$newmountpoint.
16:09 Norky localte
16:09 tqrst I'm worried that this would somehow lose the information that this brick is a replica for another brick
16:09 aliguori joined #gluster
16:09 Norky wrong window, sorry
16:09 aliguori_ joined #gluster
16:09 tqrst (this is a 20x2 setup)
16:14 gbr I may have to switch back to DRBD and iSCSI instead of Gluster and NFS.  Things were stable from June to November.  November 1st flakiness started and now I can't seem to stabilize things.
16:19 toruonu JoeJulian: I decided to write down the XFS vs EXT4 experience we had this autumn: http://toruonu.blogspot.com/2012/12/xfs-vs-ext4.html
16:19 glusterbot <http://goo.gl/5d4Jg> (at toruonu.blogspot.com)
16:19 wushudoin joined #gluster
16:20 inodb_ joined #gluster
16:22 H__ JoeJulian: Here's the gsyncd log of the symptom, it walks the tree, then crashes, then starts again and so on http://pastebin.com/SdjZ21Fi
16:22 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
16:23 nhm toruonu: interesting
16:23 H__ JoeJulian: here's the same at fpaste http://fpaste.org/8QZp/
16:23 glusterbot Title: Viewing gsyncd crash by Hans (at fpaste.org)
16:24 daMaestro joined #gluster
16:25 nhm toruonu: thanks for posting that.
16:33 gbr I host virtual machines on Gluster via NFS.  On NFS failover, my linux VM's make their drive RO, or lock up.  Is there a way to change the timeout value for Linux so it waits for the failover to occur, and then continues?
16:41 toruonu so is there a plan to fix this in 3.3.2? I'd actually like to start rolling out a wider deployment of glusterfs and right now it's kind of limited to nodes that have an older kernel...
16:45 chirino_m joined #gluster
16:46 z00dax_ joined #gluster
16:51 obryan joined #gluster
16:51 atoponce joined #gluster
16:52 __Bryan joined #gluster
16:52 redsolar_office joined #gluster
16:53 * johnmark HATE SPAM
16:53 * johnmark SMASH
16:54 carrar Just change your MX record to 127.0.0.1
16:54 carrar problem solved!!
16:55 robo joined #gluster
16:56 Rammses joined #gluster
16:57 Rammses anyone can help a noob
16:57 Rammses ?
16:57 olisch depends on your question ;)
16:57 Rammses it is about performance nothing hard ?
16:58 olisch just explain, someone will surely be able to help
16:58 plarsen joined #gluster
16:59 toruonu performance is always hard :)
16:59 Rammses ok, the plan is to deploy a gfs cluster to serve large files to a couple of studio/visual effects guys over SMB/CIFS
17:00 Rammses what need to be achieved is 400 mb/sec sequential write
17:01 Rammses that is at least a 4 gig network access but they have 10 gig infrastructure already
17:01 toruonu what's your underlying storage and how many streams
17:02 Rammses the must is only one stream , storage is like 1 file 1 TB
17:04 Rammses and i've got plenty of pc's like 20
17:05 toruonu no no underlying storage as in what kind of storage architecture… what kind of drives, their controllers all the way to nic as well as what kind of capacity the node has that serves the bricks
17:05 toruonu if you have 5400 rpm drives underneath and plan to serve from 5 3TB drives, then it's basically impossible to reach :)
17:05 toruonu or well almost :)
17:06 saz joined #gluster
17:06 toruonu the gluster guys can say if there are any gluster related issues, but basically for that kind of speed I assume you need striping. If you need redundancy then it's distributed stripe you're looking for
17:06 olisch you want to use the storage of your 20 pcs or should the 20pcs be your clients accessing the files?
17:06 Rammses i can use 20 pcs as cluster members, storage
17:06 toruonu but the gluster guys have to say if there is any downside to speed gains, I remember someone pointed me to a gluster related question on whether I really needed distributed stripe as there was some bottleneck
17:07 toruonu glusterbot can probably link it if I only remembered the keyword
17:07 olisch striped volume would be best to reach your goal, but smb/cifs would be a problem
17:07 Rammses and adding drives is not an issue
17:08 Rammses i have to use windows7 on the clients
17:08 Rammses although there is a nfs client for win7
17:08 Rammses i have lot's of drives :)
17:08 toruonu what kind of drives though probably isn't too relevant anymore. 20 nodes means at least 20 drives; if you do 2x replication for redundancy, then you are left with 10 drives for speed, that's 40MB/s per drive, which for sequential write should be doable
17:08 toruonu olisch: would cifs/smb have an issue if the write is 1TB / file as a single stream write?
17:08 toruonu how much real overhead is there?
17:09 olisch for smb/cifs you have to mount the volume on one node and then re-export it with smb. i think you will never get such performance with crappy smb
17:09 olisch smb itself has always been a performance issue
17:09 Rammses nfs ?
17:11 toruonu well … I have lots of drives :P our storage consists of 1050 drives :D all 3TB :)
17:11 toruonu ok, well most … there are a few hundred 2TB as well :p
17:11 toruonu anyway …
17:11 toruonu network mount indeed is a point too
17:11 toruonu if you ran it all through 10G, then probably a bit less of an issue
17:12 toruonu but yes I'd assume NFS is a better choice
17:12 toruonu though I have no clue how NFS on Windows behaves
17:12 Rammses dude you are the Everest of drives i got it
17:13 rudimeyer joined #gluster
17:13 y4m4 joined #gluster
17:13 Rammses so it is doable with 20 nodes and additional 2 tb drives using replication as backup right ?
17:15 nhm toruonu: ooh, are you running gluster on that?
17:16 toruonu nhm: nope, I only got gluster friendly a week ago
17:16 toruonu Hadoop is the one we use now
17:16 nhm toruonu: what are you using now?
17:16 nhm ah
17:18 nhm toruonu: it'll be interesting to hear about your experiences.  That's a pretty big deployment.
17:20 toruonu Rammses: it should be doable, but I have to say I can't say what the mount protocol limitations are … indeed if you have nodes to spare I'd test and see how it scales
17:20 toruonu test local single stream writes to disk and don't expect to get more than 80% or so to leave some margin
17:21 nightwalk joined #gluster
17:21 toruonu nhm: am planning to run gluster and hadoop in parallel using the same bricks, but orthogonal directories. I've already checked that it's supported, the only issue right now is that majority of the nodes run a new kernel and hence the ext4 bug might interfere
17:21 toruonu I can easily set up 15 servers with ca 900TB as those are running an older kernel and it's easy enough… but I'll have to see when I get time
17:21 toruonu I'll also have to think if it's viable to use some striping there to get enhanced performance… maybe 5x2 then I can increase in chunks of 10 drives
17:24 toruonu Rammses: found the link I was looking for http://joejulian.name/blog/should-i-use-stripe-on-glusterfs/
17:24 toruonu but overall reading through it your use case seems to be something that fits
17:24 glusterbot <http://goo.gl/5ohqd> (at joejulian.name)
17:24 toruonu glusterbot's slow today … :P
17:25 mooperd anyone know scality?
17:27 Rammses tourunu : reading the page
17:30 Mo_ joined #gluster
17:31 Rammses i agree that it fits my case too
17:31 Rammses i will give it a try and share the results with you
17:32 Rammses next week i am going to have time to test
17:32 Rammses by the way i've missed the irc
17:33 * johnmark notices a lot of new users coming into the mailing list late
17:33 johnmark ly
17:33 johnmark I wonder what's driving it
17:33 tqrst johnmark: in my case, it was an annoyance at community.gluster.org's editing UI
17:34 bennyturns joined #gluster
17:34 tqrst (and the lack of activity on it)
17:35 tqrst that UI just gets unusable once you get past 20-30 lines of text
17:35 mohankumar joined #gluster
17:35 nightwalk joined #gluster
17:42 johnmark tqrst: yeah, we have to change that
17:42 johnmark and by change, I mean get rid of it and replace with something like shapado
17:42 tqrst johnmark: how about a gluster tag on serverfault instead?
17:42 johnmark tqrst: not a bad idea
17:43 tqrst johnmark: I was going to write a few UI suggestions, but in the end they could be summarized by "make it more like stack overflow" :p
17:49 johnmark tqrst: ha, yeah. I definitely hear that
17:53 chandank joined #gluster
17:54 chandank today I sent an email question to the gluster mailing list and got this error "- Results:
17:54 chandank Ignoring non-text/plain MIME parts". I am using simple gmail to send emails
17:57 Daxxial_ joined #gluster
18:05 stre10k joined #gluster
18:11 stre10k I have a simple 1 brick volume. The find command on 62000 files took 2 minutes. Here are the perf stats http://dpaste.org/zhYXS/. Is this normal? LOOKUP takes more than 1 second.
18:20 rwheeler joined #gluster
18:21 chandank Hello, has anyone tried running Linux KVM based virtual machines on Gluster on production? Or could anyone suggest about its performance?
18:23 toruonu stre10k: find of 62000 files is probably normal to take 2 minutes
18:25 toruonu stre10k: http://fpaste.org/5E9Z/
18:25 glusterbot Title: Viewing [mario@ied Fall11]$ time find dataca ... datacards/ |wc -l 41711 real 1m8.08 ... (at fpaste.org)
18:27 toruonu and I have lookup unhashed turned off … doubt there's much more you can do
18:27 toruonu though find shouldn't hit that to be fair, sorry for confusion
18:28 toruonu but if you do find out how to make this perform faster on either tuning gluster or nfs side of it, do let me know
18:31 stre10k toruonu: ok, thank you
18:32 nueces joined #gluster
18:43 nightwalk joined #gluster
18:46 tqrst if I'm changing the mount point of a brick, is the following sufficient? gluster remove-brick $server:$brickmountpoint; umount $brickmountpoint; mount /dev/foobar $newmountpoint; gluster add-brick $server:$newmountpoint. My main concern is gluster forgetting which replica set the brick was part of.
18:47 bauruine joined #gluster
18:47 elyograg tqrst: off the cuff response with nothing to back it up: I would think you'd want to do replace-brick instead off remove/add.
18:49 tqrst elyograg: not sure how it would react to both mount points pointing to the same partition, though
18:55 Bullardo joined #gluster
18:56 JoeJulian H__: It says right there that it's OOM. :( I haven't looked at that python code to see if anything can be done about that.
19:00 andreask joined #gluster
19:03 GLHMarmot joined #gluster
19:04 JoeJulian chandank: Yes, many people are running kvm images on gluster volume, myself included. You're not going to get local disk performance on the image but you can do what I do which is to mount a gluster volume within your vm that holds the data you're using.
19:05 JoeJulian chandank: If you're willing to experiment, you could use the ,,(qa releases) and the new direct volume support in qemu-kvm.
19:05 glusterbot chandank: The QA releases are available at http://bits.gluster.com/pub/gluster/glusterfs/ -- RPMs in the version folders and source archives for all versions under src/
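For context, the "direct volume support" mentioned above refers to qemu's gluster:// block driver (merged around qemu 1.3); a hedged example, with host, volume and image names made up and assuming a qemu build with gluster support:
    qemu-system-x86_64 -enable-kvm -m 2048 \
        -drive file=gluster://gluster1/vmvol/disk0.img,if=virtio,cache=none   # gluster1, vmvol, disk0.img are placeholders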
19:06 chandank sounds cool. Basically currently I am using DRBD for my production VMs where the vm runs on primary DRBD resource.
19:07 chandank Any idea about gluster's performance compared to DRBD?
19:07 JoeJulian tqrst: What I would do for your brick change is to kill the brick process for the brick you're moving, change the mount, then do a replace-brick.
19:07 nightwalk joined #gluster
19:07 JoeJulian chandank: I have a certain loathing for DRBD, so I couldn't really say.
19:08 tqrst JoeJulian: change the mount as in umount the original mount point too? I thought replace-brick needed both points to be available
19:08 JoeJulian tqrst: No, you can just do a replace-brick ... force
19:09 chandank lol, that is fine. I will experiment. Would you recommend the latest gluster such as http://bits.gluster.com/pub/gluster/glusterfs/3.4.0qa4/
19:09 glusterbot <http://goo.gl/Sakme> (at bits.gluster.com)
19:09 JoeJulian Since all the xattrs are still good, it should be a pretty quick and easy process.
19:09 JoeJulian chandank: Yep, that's the one to try. Remember, of course, it's pre-release to please report any bugs you find.
19:10 chandank yeah, sure.
19:11 stre10k left #gluster
19:11 tqrst JoeJulian: I'll try on one brick and see how it goes
19:13 JoeJulian tqrst: I'll be doing the same thing so let me know if you have any problems. I'd be surprised if there's any problems though. I've done that between servers, doing it locally should be even more simple.
19:14 tqrst JoeJulian: yeah, my main concern was with replace-brick trying to store some migration state on the partition, but that should be fine if I umount the original mount point first
19:14 JoeJulian kill the brick process and there's no interface to the brick.
19:15 JoeJulian @processes
19:15 glusterbot JoeJulian: the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
19:15 * tqrst nods
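Putting JoeJulian's steps together, a sketch of the same-server brick move (the volume name "bigdata" and the old brick path appear later in the log; the new mount point and hostname are placeholders):
    pkill -f 'glusterfsd.*donottouch/localb'      # stop only that brick's export daemon (match on its brick path)
    umount /mnt/donottouch/localb
    mount /dev/mapper/bigdisk /mnt/newbrick       # same filesystem, new mount point (placeholder names)
    gluster volume replace-brick bigdata server1:/mnt/donottouch/localb server1:/mnt/newbrick commit force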
19:16 trooney joined #gluster
19:17 trooney Our sysadmin deployed some apps onto a glusterfs system. The thing's chugging 'coz the framework (symfony2) uses tons of small files... Are there any good docs for installs with lots of small reads?
19:18 ctria joined #gluster
19:21 JoeJulian Just my ,,(php) article.
19:21 glusterbot php calls the stat() system call for every include. This triggers a self-heal check which makes most php software slow as they include hundreds of small files. See http://goo.gl/uDFgg for details.
19:22 trooney JoeJulian: thanks!
19:22 nightwalk joined #gluster
19:26 trooney JoeJulian: That's a great overview. Thanks again for the link :)
19:26 trooney left #gluster
19:26 JoeJulian Hope it helps
19:36 nightwalk joined #gluster
19:38 Technicool joined #gluster
19:39 nueces joined #gluster
19:42 neofob joined #gluster
19:43 dalekurt joined #gluster
19:49 nick5 joined #gluster
20:01 rudimeyer_ joined #gluster
20:03 gbr What does this mean: [2012-12-03 12:55:38.424399] E [glusterfsd-mgmt.c:672:glusterfs_handle_translator_op] 0-glusterfs: failed to unserialize req-buffer to dictionary
20:04 gbr How about this: [2012-12-03 14:04:23.075879] E [afr-self-heald.c:685:_link_inode_update_loc] 0-NFS_RAID6_FO-replicate-0: inode link failed on the inode (00000000-0000-0000-0000-000000000000)
20:05 gbr The first is from my gluster NFS server, and the second is from my replicate doing a self heal.
20:13 saz_ joined #gluster
20:18 tqrst JoeJulian: doesn't seem to want to happen without the brick being up: http://www.fpaste.org/CDPG/
20:18 glusterbot Title: Viewing Paste #257129 (at www.fpaste.org)
20:19 tqrst it seems to be expecting a clean slate
20:25 JoeJulian /mnt/donottouch/localb or a prefix of it is already part of a volume
20:25 glusterbot JoeJulian: To clear that error, follow the instructions at http://goo.gl/YUzrh or see this bug http://goo.gl/YZi8Y
20:26 JoeJulian ... though I don't think removing the .glusterfs directory is actually required. I'd try without doing that first.
20:26 dbruhn joined #gluster
20:26 nightwalk joined #gluster
20:26 dbruhn is RMDA and infiniband fully supported in 3.3.1 now?
20:28 JoeJulian My understanding is that it's as supported as it's ever been. The statement in the official docs that it's not is because it's not "supported" by Red Hat as part of RHS. It's considered in "Technical Preview" because they haven't put enough QA testing at it.
20:28 JoeJulian That said, ymmv
20:29 dbruhn I thought there was some standing issues with it in 3.3.0
20:30 tqrst JoeJulian: doesn't matter if trusted.afr.bigdata-client-N and trusted.glusterfs.dht are still set?
20:30 JoeJulian tqrst: That /should/ be a good thing.
20:30 aliguori joined #gluster
20:31 tqrst gah, glusterfsd relaunched itself
20:31 tqrst asdf;lkgjh
20:31 JoeJulian dbruhn: I thought so too, but when I spoke with the dev who works on rdma, that's the understanding I came away with.
20:32 dbruhn Awesome, so it shouldn't be an issue to set up if I get some new hardware. Just can't call red hat support to fix infiniband and rdma stuff
20:33 tqrst JoeJulian: seems to have worked
20:33 JoeJulian tqrst: cool
20:34 JoeJulian dbruhn: I've read mixed results, so I suspect there may be some driver or library implementation issue with some hardware. Again, not enough qa data to go on.
20:34 johnmark JoeJulian: ++
20:34 tqrst JoeJulian: I can finally move all my mount points to something less dangerous than /mnt/local{b,c,d}, when /mnt/local is used for local scratch space :p
20:34 johnmark dbruhn: I'm hesitant to say it will work, but I've seen reports of IB working
20:34 JoeJulian Hehe, yeah. That could be bad.
20:35 dbruhn Ahh, does anyone know any one running IB with 3.3.1 right now?
20:35 dbruhn I would love to wax intellectual a bit on it with them
20:35 JoeJulian I think m0zes does...
20:36 rwheeler joined #gluster
20:38 * m0zes is running 3.2.7 with IB.
20:38 dbruhn m0zes how well does with run for you?
20:39 m0zes spotty at times. it is *really* sensitive to IB route changes.
20:39 dbruhn does it improve performance drastically?
20:40 m0zes dbruhn: for single clients I was seeing 5X over 1GbE and 1.5x 10GbE
20:41 m0zes aggregate throughput was a little bit higher. ~1.2-1.3x
20:41 dbruhn I am assuming it improves the small file I/O operations
20:41 dbruhn ls commands, ect
20:42 m0zes I need to do more testing with it. I haven't mounted with rdma since before I discovered one of my IB switches was failing.
20:42 m0zes ~1-2 months.
20:43 m0zes I've just had too much to do as a single admin of a 2000 core HPC cluster.
20:43 dbruhn I suppose that doesn't help with things being sensitive?
20:43 dbruhn lol
20:43 nightwalk joined #gluster
20:54 tqrst JoeJulian: for some reason, when I change the bricks on a given server, its client dies for a bit and can't reconnect for ~10 seconds
20:54 tqrst seems to work fine after that, though
20:55 tqrst I also get "Connection failed. Please check if gluster daemon is operational" for the duration when doing things like volume status
20:55 Bullardo joined #gluster
20:58 johnmark m0zes: if I introduced you to our case study guy, would you be open to filling out his questionnaire/survey?
20:59 * johnmark likes the sound of 2000 core HPC cluster
20:59 Bullardo joined #gluster
21:02 m0zes johnmark: sure I'd fill out a survey :)
21:04 tqrst wc /var/log/gluster/mnt-bigdata.log -> 9841579
21:04 tqrst D:
21:04 gbrand_ joined #gluster
21:05 nightwalk joined #gluster
21:09 jbrooks joined #gluster
21:09 genewitch Ugh the second node didn't have the drives mounted when i set up the bricks
21:09 genewitch Stupid auto-init scripts
21:13 DaveS_ joined #gluster
21:19 johnmark m0zes: cool :)
21:22 tqrst is there a way to start glusterd but prevent it from launching glusterfsd?
21:23 genewitch tqrst: why
21:23 genewitch that's how it talks to the other bricks
21:23 tqrst genewitch: I'm in the middle of moving a brick
21:24 tqrst genewitch: glusterd decided to crash, but I need it to be up to run replace-brick
21:24 genewitch tqrst: i just started using gluster last week, i just thought it needed to be able to talk to the other bricks
21:25 genewitch don't the peers need to know that you're replacing a brick on that node?
21:25 genewitch i guess you could just block the brick ports?
21:27 Bullardo_ joined #gluster
21:29 Bullardo joined #gluster
21:30 JoeJulian That's a good idea.
21:31 tqrst genewitch: I thought that was glusterd's job, not glusterfsd
21:31 tqrst JoeJulian: replace-brick decided to start failing halfway through my renaming :\
21:32 JoeJulian :(
21:34 lh joined #gluster
21:35 tqrst and bug 847821 made my logs 10 million lines long, so it's a bit of a pain to grep through
21:35 glusterbot Bug http://goo.gl/gJor4 low, medium, ---, rabhat, ASSIGNED , After disabling NFS the message "0-transport: disconnecting now" keeps appearing in the logs
21:35 Bullardo_ joined #gluster
21:38 tqrst (plus a whole bunch of split-brain errors)
21:41 tqrst logs here http://www.fpaste.org/yL3E/
21:41 glusterbot Title: Viewing le sigh (at www.fpaste.org)
21:42 tqrst that first /mnt/donottouch/localc should read localb
21:45 tryggvil_ joined #gluster
21:45 tryggvil joined #gluster
21:52 H__ JoeJulian: ok, thanks for looking at the gsyncd paste. as it's Python code maybe I can try it from a newer gluster version ?
21:55 tqrst does gluster cache dns info? I'm seeing a whole lot of dns requests for my gluster servers every second or two...
21:55 Technicool tqrst, it does not
21:55 tqrst Technicool: wat
21:55 tqrst can I make it?
21:55 nightwalk joined #gluster
21:56 Technicool although jdarcy of Gluster fame and awesomeness did something akin to this with a xlator for caching negative lookups for PHP
21:56 Technicool tqrst, yes ^^   ;)
21:56 tqrst Technicool: those dns requests are eating up enough bandwidth for the network guys to drop into my office :p
21:57 Technicool https://github.com/jdarcy/negative-lookup/blob/master/negative.c
21:57 glusterbot <http://goo.gl/xB2Wl> (at github.com)
21:57 JoeJulian seriously? I hadn't looked at that. You could use /etc/hosts
21:58 Technicool JoeJulian +1 for properly mentioning probably the only place to ever use /etc/hosts
21:58 JoeJulian I know, right?
21:58 Technicool you can also add an entry for the gluster nodes on the DNS host(s)
21:58 * JoeJulian starts monitoring his dns bandwidth...
21:58 tqrst JoeJulian: tshark -t a | grep -i dns should show you plenty of noise to look at
21:58 JoeJulian Technicool: The clients need dns entries?
21:59 Technicool JoeJulian, just mean entries to/from any node that is spamming the requests
22:00 JoeJulian That made what you're saying less clear.
22:00 Technicool JoeJulian, that happens a lot with me
22:00 inodb joined #gluster
22:00 JoeJulian So a client is spamming for dns entries... how would adding a dns entry for that client squelch that?
22:01 dalekurt joined #gluster
22:01 Technicool JoeJulian,this will either help or make things worse: if a client is spamming lookups, add an entry to the DNS host on the client, and an entry for the client on the DNS node
22:01 JoeJulian ('course it's not really spamming... it's just once every 3 seconds per server)
22:01 Technicool doesn't squelch it, just makes the lookups faster since they are coming from an entry at worst on disk and at best already in RAM
22:01 tqrst once every 3 seconds per server per client
22:02 Technicool people are coming into your office over dns requests every 3 seconds?
22:02 JoeJulian Oh! Add a dns server to the client...
22:02 JoeJulian Sure, if you have 6000 clients!
22:03 Technicool even with 6000 clients, that doesn't seem like enough traffic to warrant anyone caring...but then, everything is subject to YMMV
22:03 * JoeJulian thinks that gluster should honor dns timeouts.
22:04 inodb joined #gluster
22:04 JoeJulian 6000 clients, 200 servers, that's 400,000 queries/second.
22:05 * tqrst thinks this qualifies as a bug
22:05 JoeJulian Hey glusterbot, how can he file a bug report?
22:05 glusterbot http://goo.gl/UUuCq
22:06 JoeJulian tqrst: You could install a caching nameserver on the clients. That would reduce the traffic and shouldn't add much overhead to the clients.
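Two hedged ways to keep those lookups off the central resolver (the hostnames and the second address below are placeholders; 192.168.1.240 appears earlier in the log):
    # option 1: pin the brick servers in /etc/hosts on every client
    printf '192.168.1.240 gluster-a\n192.168.1.241 gluster-b\n' >> /etc/hosts
    # option 2: run a local caching resolver and point resolv.conf at it
    yum install -y dnsmasq && service dnsmasq start
    sed -i '1i nameserver 127.0.0.1' /etc/resolv.conf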
22:07 tqrst redhat's bugzilla's search "product" thingamajiggy could sure use a search feature itself
22:08 tqrst which component should I use for this?
22:08 johnmark tqrst: glusterfs community
22:08 TSM2 joined #gluster
22:09 tqrst johnmark: I meant component in the bug report
22:09 johnmark oh. heh sorry
22:09 tqrst there's no 3.3.1 version entry btw, only 3.3.0
22:10 johnmark tqrst: ah, good to know
22:10 tqrst I'll just put in 3.3.0 for now
22:10 JoeJulian Yeah, I asked about that months ago... :/
22:10 tqrst Steps to reproduce: ...run gluster? :p
22:16 Technicool JoeJulian, is there a reason to do the cachine nameserver versus adding the servers as /etc/hosts entries on the clients?
22:16 Technicool caching even
22:16 JoeJulian Easier to manage
22:17 JoeJulian Granted, if you're managing with puppet, it's not that hard, but still...
22:18 Technicool given the amount of times I have seen uberphail on someone forgetting about an /etc/hosts entry, i'll go with you on it
22:19 tqrst testing with /etc/hosts right now just to see
22:19 a2 JoeJulian, i believe getaddrinfo() has logic to retry DNS timeouts?
22:20 JoeJulian I thought so too
22:20 * a2 checks
22:20 elyograg hosts is an ok thing if you've got a very small number of machines.  As soon as you need more than one hand to count them, it's probably time to use DNS.  I'd put them in DNS anyway, even if I was using hosts.
22:20 tqrst yeah, if this works I'll put in a dns server for this set of machines instead of flooding IT's ns two floors down
22:21 JoeJulian Using puppet and exported resources, it wouldn't be hard at all to make a /etc/hosts that has all your machines, but that just sounds dirty.
22:21 Technicool elyograg, agreed, but the specific issue here is to avoid the DNS lookups
22:21 nightwalk joined #gluster
22:22 Technicool so you still want them in DNS
22:22 Technicool you just don't want all the clients spamming
22:22 Technicool or semi-spamming, depending on your definition ;)
22:24 elyograg if you've got a reasonable TTL, I wouldn't expect the spam storm to be continuous.  local dns caches as has already been suggested would make that even less of a problem.
22:26 elyograg I wonder if there is any sort of "randomization" available on TTLs, either by the server or by the client.  make each cached value wait a few seconds shorter or longer than the given TTL, and over time the requests would spread out.
22:27 Bullardo joined #gluster
22:29 tqrst joined #gluster
22:29 tqrst joined #gluster
22:29 DaveS_ joined #gluster
22:30 a2 JoeJulian, there is logic in glibc to retry on timeout
22:30 a2 *retry dns query
22:31 JoeJulian So tqrst need to look at his dns timeouts.
22:31 Bullardo joined #gluster
22:31 a2 but a dns failure will only push out reconnection to the next iteration after 3 secs
22:32 a2 it shouldn't be a "permanent" error in glusterfs
22:32 tqrst sorry - got disconnected for a bit
22:32 JoeJulian Ah, we're talking different terms for timeout. I'm referring to the cache time that the dns entry should remain valid for, not the request timing out.
22:33 jdarcy I use /etc/hosts all the time, because I don't control our lab DNS and the test-machine names are freaking ridiculous.
22:34 JoeJulian "fubar1 IN A 3600 1.2.3.4" should only be re-checked once/hour.
22:34 jdarcy I guess I could set up my own DNS server and point /etc/resolv.conf to that.
22:35 elyograg JoeJulian: i believe that bind calls that the TTL, or time to live.
22:36 JoeJulian Pfft... look at you and your insistence on using the correct terms. ;)
22:36 * JoeJulian hides his hypocrisy.
22:36 elyograg heh.\
22:37 tqrst we should just use 'gluster' for everything
22:37 tqrst I glustered the gluster and then it glustered down
22:37 JoeJulian That sounds gluster.
22:37 tqrst so am I the only one getting all those dns requests?
22:37 Technicool thats sounds smurfing glustastic
22:37 JoeJulian I'm not getting any of your dns requests.
22:40 JoeJulian I'm not seeing that problem, no.
22:40 elyograg my ttl randomization idea, given an hour for the ttl and a one percent randomization factor, would result in the clients waiting between 3564 and 3636 seconds.  With the example of 6000 clients and 200 servers, if you started all the clients at the exact same moment, you'd have an initial flood of 400,000 requests, but after a few hours, they would start arriving a few at a time instead of all at once.
22:40 JoeJulian My brick server hostnames have 5 minute ttl and that seems to be working correctly.
22:42 elyograg actually I guess the math there works out to 1.2 million.  yikes!
22:42 JoeJulian Yeah, the 400k/s was with polling at 3 second intervals.
22:44 nightwalk joined #gluster
22:44 GLHMarmot joined #gluster
22:45 tqrst dig says it should be 10 minutes
22:57 tqrst interestingly enough, starting nscd doesn't help (as opposed to stuffing everything into /etc/hosts)
22:58 JoeJulian What version of gluster are you running?
22:58 tqrst 3.3.1
22:59 JoeJulian I think I'd look at my packet captures in wireshark and make sure that dns is working the way you think it is.
23:07 puebele joined #gluster
23:14 nightwalk joined #gluster
23:20 mooperd Anyone here heard of scality?
23:21 a2 yeah
23:29 designbybeck joined #gluster
23:30 nightwalk joined #gluster
23:36 genewitch if you're geo-replicated can you write to either set of bricks? and read from either set? for HA?
23:36 JoeJulian geo-replication is currently unidirectional.
23:37 JoeJulian from master to slave
23:37 genewitch so is there documentation for uncoupling and using the backup set of bricks temporarily?
23:37 genewitch and geo-replication doesn't care about the NAT that exists on the cloud, does it? my co-worker said that it syncs through ssh
23:39 plarsen joined #gluster
23:40 genewitch or promoting the slaves to master, i guess
23:42 JoeJulian There is not. And yes, it does a targeted rsync over ssh.
23:49 nueces joined #gluster
23:56 H__ joined #gluster
