
IRC log for #gluster, 2012-12-05


All times shown according to UTC.

Time Nick Message
00:04 nightwalk joined #gluster
00:06 JoeJulian If you could specify which address to listen on, you could just add ip addresses to add another 64k ports.
00:07 JoeJulian I suppose you could use virtualization to do that.
00:08 semiosis JoeJulian: multi-homing, maybe?
00:09 semiosis if you partitioned your servers into different ip subnets and added alias addresses to your client ethernet interface for each of the ip subnets maybe things would naturally bind to the respective address
00:09 semiosis that's a /16 for each server subnet
00:10 JoeJulian except the process listens on 0.0.0.0
00:10 semiosis JoeJulian: nope, client makes outbound connections to servers that are listening
00:11 semiosis i think that would work... noob2, a2
00:12 JoeJulian Oh, I was looking at the max number of bricks per server.
00:12 semiosis ah, never mind the /16 thing, thats irrelevant
00:13 JoeJulian Theoretically, subnetting should work, as long as there's not some other kernel limit we're not aware of.
00:13 semiosis open files, but i think that can be increased pretty significantly
00:13 JoeJulian I thought about that as I was writing up the math for that brontobyte calculation.
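
A minimal sketch of the multi-homing idea semiosis floats above, assuming a hypothetical client interface eth0 and two illustrative server subnets (none of these addresses come from the log):

    ip addr add 10.0.1.200/24 dev eth0 label eth0:1   # alias in server subnet 1
    ip addr add 10.0.2.200/24 dev eth0 label eth0:2   # alias in server subnet 2
    # Outbound connections to servers in 10.0.1.0/24 would then source from the
    # first alias and those to 10.0.2.0/24 from the second, giving each source
    # address its own ephemeral port range.
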
00:15 nightwalk joined #gluster
00:26 kevein joined #gluster
00:33 dalekurt joined #gluster
00:53 jiffe1 is mounting a gluster filesystem and then re-exporting that filesystem with nfs not a good way to go?
00:54 jiffe1 I'm doing this so that I can take advantage of gluster's failover but seem to be running into nfs issues this way
00:56 Technicool jiffe, it's not uncommon
00:57 Technicool not sure if the failover would work in your case, moment while i read slower-ly
00:58 Technicool the failover should work in that case, what nfs issues are you seeing?
01:00 yinyin joined #gluster
01:00 jiffe1 seeing lots of messages in my kern.log like 'fsid 0:16: expected fileid 0xbbb801eb762e8272, got 0x85ffe21c9e61cf23' and 'NFS: server 127.0.0.1 error: fileid changed' and just ran into a stale nfs file handle
01:01 Technicool are you serving VM's?
01:03 manik joined #gluster
01:06 blendedbychris1 joined #gluster
01:08 jiffe1 the web server running the gluster client/nfs is on a virtual machine, the gluster servers are not
01:10 jiffe1 oh I see, no I am not serving virtual machines off of this
01:10 jiffe1 it is just web content
01:11 jiffe1 running nfs like this the content seems to be pretty snappy, just run into these kinds of issues occasionally
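
A minimal sketch of the setup jiffe1 describes, with the volume mounted by the native client and re-exported over kernel NFS; the hostname, paths and fsid value are illustrative assumptions, not from the log:

    mount -t glusterfs server1:/myvol /mnt/gluster     # native client provides the failover
    # Kernel NFS has no stable device number for a FUSE filesystem, so the
    # export needs an explicit fsid= in /etc/exports:
    echo '/mnt/gluster *(rw,no_subtree_check,fsid=14)' >> /etc/exports
    exportfs -ra
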
01:18 suehle joined #gluster
01:24 kevein joined #gluster
01:30 yinyin joined #gluster
01:57 gbr I installed semiosis gluster 3.3.1 deb for Ubuntu.  Running 'stop glusterfs-server' doesn't actually stop glusterfs.  Anyone else seen this?
01:58 yinyin joined #gluster
01:59 Technicool gbr, thats expected i think
01:59 Technicool the gluster server is simply for management, it is separate from data transfer so that you don't lose data if you lose management
02:01 pdurbin left #gluster
02:11 sunus joined #gluster
02:11 gbr Technicool: OK, thanks.  I got access to the base file system by doing a 'gluster stop'.  I needed to unmount and do an fsck.
02:13 Technicool gbr, gluster volume <foo> stop is the way to stop the fs, yes
02:27 lng joined #gluster
02:27 JoeJulian ... though, if you're running a replicated volume and feel you have a valid replica state, you /can/ just kill the glusterfsd ,,(processes)
02:27 glusterbot the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
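
A minimal sketch of spotting the processes glusterbot lists, should you want to stop a single brick the way JoeJulian describes (the force-start step is an assumption about 3.3-era behaviour; the volume name is a placeholder):

    pgrep -lx glusterd                 # management daemon, one per server
    pgrep -lx glusterfsd               # brick export daemons, one per brick
    pgrep -lx glusterfs                # FUSE client mounts and the gluster NFS server
    kill <pid-of-one-brick>            # with a valid replica the other copy keeps serving
    gluster volume start <vol> force   # restarts any brick daemons that are down
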
02:28 lng JoeJulian: Hello! Seems like Replicated Gluster is troublesome with small files on EC2...
02:29 JoeJulian I'm so sick of that phrase...
02:29 lng which one? mine?
02:29 JoeJulian http://www.joejulian.name/blog/nfs-mount-for-glusterfs-gives-better-read-performance-for-small-files/
02:29 glusterbot <http://goo.gl/nDnxh> (at www.joejulian.name)
02:29 JoeJulian "small files"
02:30 lng JoeJulian: I'm about split-brain
02:30 Technicool is it...bigger than a breadbox?
02:30 lng JoeJulian: a lot of corrupted files
02:30 JoeJulian I was split-brain about an hour ago... I'm going home.
02:30 JoeJulian Oh! You meant your filesystem.
02:30 JoeJulian I meant my actual brain...
02:30 lng :)
02:31 lng I understood
02:31 lng maybe I need to remove replicas
02:31 lng but doing so will make it hard to scale
02:31 JoeJulian Do you know /why/ you're split-brain?
02:31 lng me?
02:31 JoeJulian yeah
02:32 JoeJulian We'll come back to the "hard to scale" question after I get home.
02:33 lng JoeJulian: if it's clustered, there is no downtime when you scale up/down your nodes
02:34 lng JoeJulian: the thing is, I see a lot of errors when I execute healing
02:34 lng I can see a lot of errors in the application as well because it could not write to locked files
02:34 JoeJulian but do you know /how/ your cluster got into that state? You /should/ because there's pretty specific patterns that cause that.
02:35 lng JoeJulian: network partition
02:35 lng aond/or
02:35 lng and*
02:35 lng scaling
02:35 * m0zes is always reminded of people trying to open multiple r/w sqlite3 apps on the same sqlite3 db...
02:36 * m0zes has silly users.
02:36 lng after I restarted the nodes I got many split-brain errors
02:38 m0zes ,,(split-brain)
02:38 glusterbot (#1) learn how to cause split-brain here: http://goo.gl/nywzC, or (#2) To heal split-brain in 3.3, see http://goo.gl/FPFUX .
02:38 lng m0zes: I do it every day
02:39 lng m0zes: how do you heal it if you have gfid instead of file?
02:39 lng in case of split brain
02:40 lng the last time I just deleted those entries by gfids in .glusterfs
02:40 lng but actual files were left intact
02:40 m0zes I honestly don't know. I am still running 3.2.7, so I have no self heal daemon and the logs /I/ see never just reference gfid.
02:41 lng because searching for them by inode is extremely slow
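
A minimal sketch of turning a bare gfid from the heal output back into a brick path for a regular file, using the fact that the .glusterfs entry is a hard link to the real file (the brick path and gfid are hypothetical placeholders):

    BRICK=/export/brick1
    GFID=01234567-89ab-cdef-0123-456789abcdef          # placeholder gfid from heal info
    GPATH=$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID
    INODE=$(stat -c %i "$GPATH")
    # Same inode number means same file; this crawl is why it is slow on large bricks:
    find "$BRICK" -inum "$INODE" -not -path "*/.glusterfs/*"
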
02:42 lng how can I flush the info of `gluster volume heal storage info heal-failed`?
02:42 lng how do I know when the heal procedure has completed?
02:46 lng oh! today is the first time I see no split-brain files in info split-brain
02:46 lng 'Number of entries: 0' for all 8 bricks!
02:49 lng but have a lot of entries by 'info heal-failed'
02:49 lng gfids
02:49 lng what does it mean?
02:52 lng I also have files like these on the mount: ?????????? ? ?        ?           ?            ? game.dat2012111611
02:54 stre10k joined #gluster
02:54 stre10k joined #gluster
02:59 bharata joined #gluster
03:07 saz joined #gluster
03:49 __Bryan__ joined #gluster
04:03 sripathi joined #gluster
04:13 yinyin joined #gluster
04:15 mohankumar joined #gluster
04:24 hagarth joined #gluster
04:33 hagarth left #gluster
04:33 hagarth joined #gluster
04:42 nightwalk joined #gluster
04:44 sripathi1 joined #gluster
04:49 deepakcs joined #gluster
04:54 sgowda joined #gluster
05:03 yinyin joined #gluster
05:10 avati joined #gluster
05:21 vpshastry joined #gluster
05:27 bala joined #gluster
05:27 Bullardo joined #gluster
05:35 rastar joined #gluster
05:39 bulde joined #gluster
05:41 sgowda joined #gluster
05:43 harshpb joined #gluster
06:01 raghu joined #gluster
06:02 puebele joined #gluster
06:15 unlocksmith joined #gluster
06:22 Bullardo joined #gluster
06:22 vikumar joined #gluster
06:23 efries joined #gluster
06:23 harshpb joined #gluster
06:24 hagarth joined #gluster
06:24 ankit9 joined #gluster
06:35 vpshastry joined #gluster
06:35 primusinterpares joined #gluster
06:46 mohankumar joined #gluster
06:49 ngoswami joined #gluster
07:00 primusinterpares joined #gluster
07:03 guigui3 joined #gluster
07:12 sripathi joined #gluster
07:20 ramkrsna joined #gluster
07:20 ramkrsna joined #gluster
07:29 inodb^ joined #gluster
07:30 glusterbot New news from resolvedglusterbugs: [Bug 862332] migrated data with "remove-brick start" unavailable until commit <http://goo.gl/gYZ6p>
07:31 sripathi joined #gluster
07:37 unlocksmith left #gluster
07:39 unlocksmith joined #gluster
07:39 inodb_ joined #gluster
07:41 bharata joined #gluster
07:41 the-me joined #gluster
07:46 ekuric joined #gluster
08:10 tjikkun_work joined #gluster
08:13 ctria joined #gluster
08:15 dobber joined #gluster
08:18 manik joined #gluster
08:18 ramkrsna_ joined #gluster
08:18 yinyin joined #gluster
08:22 bitsweat joined #gluster
08:22 lkoranda joined #gluster
08:24 morse joined #gluster
08:25 andreask joined #gluster
08:34 redsolar joined #gluster
08:35 SpeeR joined #gluster
08:37 sunus in NODE A(192.168.0.245): peer probe 192.168.0.233                    in NODE B(192.168.0.233): unable to find hostname 192.168.0.245  why?
08:37 sunus both NODE a,b can ping each other
08:37 gmcwhistler joined #gluster
08:43 ankit9 joined #gluster
08:50 tryggvil joined #gluster
08:50 tryggvil_ joined #gluster
08:58 JoeJulian ~hostnames | sunus
08:58 glusterbot sunus: Hostnames can be used instead of IPs for server (peer) addresses. To update an existing peer's address from IP to hostname, just probe it by name from any other peer. When creating a new pool, probe all other servers by name from the first, then probe the first by name from just one of the others.
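
A minimal sketch of the procedure glusterbot describes, with hypothetical hostnames:

    # From the first server, probe every other peer by name:
    gluster peer probe server2
    gluster peer probe server3
    # Then, from any one of the others, probe the first by name so its address
    # is recorded as a hostname rather than an IP:
    gluster peer probe server1
    gluster peer status    # each peer should now show a hostname and 'Peer in Cluster'
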
08:59 tryggvil_ joined #gluster
08:59 tryggvil joined #gluster
09:03 shireesh joined #gluster
09:07 manik joined #gluster
09:18 rags_ joined #gluster
09:19 sgowda joined #gluster
09:31 DaveS_ joined #gluster
09:31 ngoswami joined #gluster
09:32 toruonu joined #gluster
09:37 copec joined #gluster
09:38 bauruine joined #gluster
09:58 guigui4 joined #gluster
09:59 Staples84 joined #gluster
10:00 olisch joined #gluster
10:06 manik joined #gluster
10:23 hagarth joined #gluster
10:25 deepakcs joined #gluster
10:37 tryggvil_ joined #gluster
10:37 tryggvil joined #gluster
10:42 mooperd joined #gluster
10:50 glusterbot New news from newglusterbugs: [Bug 883785] RFE: Make glusterfs work with FSCache tools <http://goo.gl/FLkUA>
10:50 chirino joined #gluster
10:52 rags_ joined #gluster
10:57 ankit9 joined #gluster
11:31 nightwalk joined #gluster
11:36 rags_ joined #gluster
11:43 tryggvil joined #gluster
11:43 tryggvil_ joined #gluster
11:51 glusterbot New news from newglusterbugs: [Bug 876214] Gluster "healed" but client gets i/o error on file. <http://goo.gl/eFkPQ>
11:54 vikumar joined #gluster
12:01 glusterbot New news from resolvedglusterbugs: [Bug 876222] Gluster "healed" but client gets i/o error on file. <http://goo.gl/hwxIg>
12:01 H__ The rebalance action after adding server B did not spread data evenly over servers A+B, what can I do to make the setup HA now ?
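
For context, a sketch of the 3.3-era rebalance commands behind the question (volume name is illustrative); rebalance only redistributes data across distribute bricks, while HA additionally requires replication:

    gluster volume rebalance myvol fix-layout start   # recompute the directory hash layout
    gluster volume rebalance myvol start              # migrate existing files to new bricks
    gluster volume rebalance myvol status             # watch progress and failures
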
12:04 olisch1 joined #gluster
12:11 andreask joined #gluster
12:19 nueces joined #gluster
12:24 mohankumar joined #gluster
12:26 theron joined #gluster
12:42 13WAACKVG joined #gluster
12:42 3JTAABDOI joined #gluster
12:45 deepakcs joined #gluster
12:50 rastar left #gluster
12:53 tryggvil joined #gluster
13:07 vpshastry left #gluster
13:12 vpshastry joined #gluster
13:47 nightwalk joined #gluster
13:48 gbr First night in a week my XenServers haven't lost their NFS (Gluster NFS) mounts.  Nice!  Upgraded to 3.3.1
13:50 mdarade1 joined #gluster
13:55 tqrst am I blind, or is there no way to clear the 'gluster volume heal MYVOL info split-brain' logs? I just fixed a few split brained files, yet I'm still seeing many entries for them that date back from yesterday. It makes it hard to tell what is left to clean up.
13:55 hagarth joined #gluster
13:58 aliguori joined #gluster
14:00 mohankumar joined #gluster
14:06 vpshastry left #gluster
14:16 wN joined #gluster
14:20 mohankumar joined #gluster
14:28 bennyturns joined #gluster
14:32 Nr18 joined #gluster
14:35 chirino joined #gluster
14:36 noob2 joined #gluster
14:42 mdarade1 left #gluster
14:52 __Bryan__ joined #gluster
14:52 spn joined #gluster
14:58 aliguori joined #gluster
14:59 mdarade1 joined #gluster
15:04 stopbit joined #gluster
15:04 spn joined #gluster
15:19 nightwalk joined #gluster
15:20 obryan joined #gluster
15:22 puebele joined #gluster
15:24 semiosis :O
15:24 obryan left #gluster
15:28 wushudoin joined #gluster
15:30 gbr joined #gluster
15:31 gbr tqrst: Did you get an answer to the split brain logs?
15:39 raghu joined #gluster
15:41 wushudoin joined #gluster
15:41 khushildep joined #gluster
15:41 flakrat joined #gluster
15:52 nightwalk joined #gluster
16:00 daMaestro joined #gluster
16:01 sjoeboo joined #gluster
16:04 tryggvil joined #gluster
16:05 nueces joined #gluster
16:07 mooperd joined #gluster
16:10 jbrooks joined #gluster
16:21 tqrst gbr: no
16:22 tqrst gbr: I found a bug report that says restarting glusterd supposedly helps, but that only clears the log locally and then resyncs back to what it was after a bit
16:22 tqrst bug 86496
16:22 tqrst bug 864963
16:23 * tqrst prods glusterbot
16:23 glusterbot Bug http://goo.gl/r0qQK low, medium, ---, katzj, CLOSED NOTABUG, system does not boot after finish of kickstart install
16:23 glusterbot Bug http://goo.gl/8tcCO low, medium, ---, vsomyaju, ASSIGNED , Heal-failed and Split-brain messages are not cleared after resolution of issue
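
A sketch of the workaround tqrst mentions, which per bug 864963 only clears the list locally before it repopulates (volume name is illustrative):

    service glusterd restart
    gluster volume heal myvol info heal-failed   # entries return once state resyncs
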
16:26 sameer joined #gluster
16:32 blendedbychris joined #gluster
16:32 blendedbychris joined #gluster
16:37 nueces joined #gluster
16:43 {p120d16y} joined #gluster
16:43 {p120d16y} left #gluster
16:58 ramkrsna_ joined #gluster
16:58 Nr18 joined #gluster
17:16 JoeJulian @reconnect
17:17 glusterbot joined #gluster
17:17 mohankumar joined #gluster
17:20 bambi2 joined #gluster
17:26 nullsign joined #gluster
17:29 morse joined #gluster
17:30 mdarade1 left #gluster
17:34 rbennacer joined #gluster
17:35 rbennacer anyone knows if glusterfs 3.3.1 is backward compatible with the gluster 3.3.0?
17:40 zaitcev joined #gluster
17:47 Mo___ joined #gluster
17:49 nightwalk joined #gluster
17:51 kkeithley rbennacer: yes, it is
17:55 avati joined #gluster
18:00 harshpb joined #gluster
18:01 quillo joined #gluster
18:02 khushildep joined #gluster
18:05 andreask left #gluster
18:16 gbr rbennacer: I had a replicate cluster (2 servers) running. 1 was 3.3.0 and the other 3.3.1.  The 3.3.0 server was having issues already, but they seemed to be exacerbated when 3.3.1 was placed on the other server.  Now that 3.3.1 is on both, things are (so far) more stable.  It could have been something other than the version mismatch though.
18:33 atrius joined #gluster
18:34 Nr18 joined #gluster
18:42 Nr18 joined #gluster
18:46 rbennacer gbr, i might have the same problem
18:48 dustint joined #gluster
18:50 ctria joined #gluster
18:52 zwu joined #gluster
18:57 bauruine joined #gluster
19:03 JoeJulian I ran across several issues with 3.3.0 and I think at least some of them were rpc related. Since the 3.3.0 bugs will still be there with mixed versions, you should still see the 3.3.0 bugs.
19:09 rbennacer JoeJulian, did you have different versions of glusterfs in each node?
19:14 schmidmt1 joined #gluster
19:15 schmidmt1 Has anyone had issues with two mounts to a gluster volume showing different files?
19:16 dan_a joined #gluster
19:17 JoeJulian rbennacer: For about a day, yes.
19:18 rbennacer JoeJulian, is it causing any problem?
19:18 AK6L left #gluster
19:18 rbennacer like directories or files not showing up
19:19 JoeJulian No, nothing like that. Just inabilities to change volume configuration, glusterd crashing during certain cli operations, stuff like that.
19:19 rbennacer mmm
19:20 rbennacer what different glusterfs versions do you have on each node?
19:20 JoeJulian schmidmt1: That should only be possible if the clients aren't actually connecting to each server in the replica set. Check your client logs.
19:21 schmidmt1 I'll look there. Thanks.
19:22 JoeJulian rbennacer: Now, they're all 3.3.1. For one day (while I was making the transition) two of my three servers were 3.3.1 and one was 3.3.0. The 20 or so clients were mixed as well to varying degrees as I was able to take them out of rotation and remount.
19:23 obryan joined #gluster
19:25 Technicool joined #gluster
19:28 jmara joined #gluster
19:40 schmidmt1 JoeJulian: Any idea what I should be looking for?
19:41 JoeJulian connections and disconnections seem a likely candidate... Make sure it's connecting to both (all?) your servers.
19:46 schmidmt1 I believe they're connected. The log doesn't make it clear if they are or not.
19:47 schmidmt1 I am able to traverse the share though
19:47 JoeJulian You could check with netstat too
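
A sketch of the connectivity check being discussed here; the volume name is illustrative:

    # On the client: expect one established TCP connection per brick
    netstat -tnp | grep gluster
    # On a server: confirm every brick shows as online, with its port and PID
    gluster volume status myvol
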
19:47 schmidmt1 They are all connected
19:47 harshpb joined #gluster
19:48 schmidmt1 Is there anything else which could cause a file system splitting?
19:50 _ilbot joined #gluster
19:50 Topic for #gluster is now  Gluster Community - http://gluster.org | Q&A - http://community.gluster.org/ | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - http://irclog.perlgeek.de/gluster/
19:50 JoeJulian You are doing all those writes and file ops through a client mount, right? Not directly on the bricks?
19:50 schmidmt1 3.3.0-1 is our version on CentOS 6 boxes.
19:50 schmidmt1 All client operations.
19:50 JoeJulian selinux?
19:50 schmidmt1 Nope
19:51 JoeJulian I know that there's some nice crash bugs in 3.3.0-1, but those are mostly cli related.
19:52 JoeJulian Have you made any changes to the volume since you created it?
19:52 y4m4 joined #gluster
19:52 schmidmt1 I haven't had any issues with crashing thus far.
19:53 y4m4 joined #gluster
19:53 glusterbot New news from newglusterbugs: [Bug 884280] distributed volume - rebalance doesn't finish - getdents stuck in loop <http://goo.gl/s4xvj>
20:00 schmidmt1 JoeJulian: Thanks for helping out
20:07 JoeJulian did you find something?
20:09 schmidmt1 I have not, unfortunately.
20:10 schmidmt1 We'll try to consistently reproduce the issue. If we can then I'll have more info then.
20:42 daMaestro joined #gluster
20:45 mooperd_ joined #gluster
20:46 nightwalk joined #gluster
21:11 puebele joined #gluster
21:15 dalekurt joined #gluster
21:44 noob2 i did some iops analysis on my gluster last night.  turned out pretty good i think
21:45 noob2 iops=1125 from fio
21:45 noob2 i have 48 disks, 10Gb network and 2 replications
21:45 noob2 i mean replica=2
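
A sketch of an fio run in the spirit of noob2's test; every parameter here is an illustrative assumption, not taken from the log:

    fio --name=randrw --directory=/mnt/gluster --rw=randrw --bs=4k \
        --size=1g --direct=1 --ioengine=libaio --iodepth=16 \
        --numjobs=4 --runtime=60 --time_based --group_reporting
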
21:47 andreask joined #gluster
21:51 JoeJulian nice
21:52 chirino joined #gluster
21:52 a2 noob2, what release are you using?
21:53 noob2 um 3.3.0-6 i believe
21:57 JoeJulian 3.3.1 is (not sure why) faster on my system.
21:58 JoeJulian I think it's the selinux related changes.
21:58 noob2 interesting.  that'll be good when i upgrade :D
22:00 t35t0r joined #gluster
22:00 t35t0r joined #gluster
22:11 mooperd joined #gluster
22:12 puebele joined #gluster
22:28 tqrst is cluster.quorum-type=Auto safe to use in 3.3.1? It doesn't seem to be documented in the admin guide at all.
22:33 JoeJulian tqrst: Yes, it's on the ,,(options) wiki page.
22:33 glusterbot tqrst: http://goo.gl/dPFAf
22:33 JoeJulian jdarcy recently mentioned his need to get that into the cli help as well.
22:35 tqrst also found http://hekafs.org/index.php/2011/11/quorum-enforcement/
22:35 glusterbot <http://goo.gl/cFQm3> (at hekafs.org)
22:35 tqrst thanks
22:35 elyograg that wiki page talks about cluster.quorum-type, but auto is not mentioned.
22:39 elyograg if i understand quorum properly, it is useless right now for replica 2.  What I think would be a good idea for quorum is to have other gluster instances serve as a "vote-only" member of the quorum, sorta like a standby node in corosync/pacemaker.
22:40 elyograg it wouldn't work if you only have two nodes, but if you've got no-brick peers for NFS/Samba/UFO access, or you've expanded beyond two peers, there would be instances for that.
22:40 JoeJulian Auto is replica count / 2 + 1, so yeah, for replica 2 it means that all must be available to the client or it won't be able to write.
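
A sketch of enabling the option tqrst asked about (volume name is illustrative):

    gluster volume set myvol cluster.quorum-type auto
    # Per JoeJulian above, auto means replica-count/2 + 1 bricks must be up,
    # so a replica 2 volume stops accepting writes if either brick is unreachable.
    # A fixed count is the alternative:
    gluster volume set myvol cluster.quorum-type fixed
    gluster volume set myvol cluster.quorum-count 2
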
22:41 JoeJulian jdarcy and a2 (avati) have talked many times about how to implement an observer... not sure where they are on that though.
22:42 elyograg JoeJulian: is there a BZ for it?  I'm not in a position right now to make one, but I could probably do so in a few hours.
22:42 JoeJulian @query observer
22:42 glusterbot JoeJulian: No results for "observer."
22:43 JoeJulian @query quorum
22:43 glusterbot JoeJulian: Bug http://goo.gl/ZEu0U high, unspecified, ---, pkarampu, ASSIGNED , Implement a server side quorum in glusterd
22:43 glusterbot JoeJulian: Bug http://goo.gl/f4Puw unspecified, unspecified, ---, jdarcy, VERIFIED , CHILD_UP/CHILD_DOWN behavior breaks quorum calculations
22:43 glusterbot JoeJulian: Bug http://goo.gl/4PzWe low, medium, ---, jdarcy, VERIFIED , Go read-only if quorum not met
22:45 JoeJulian Nope, doesn't look like it.
22:45 JoeJulian Probably should file a bug on the lack of cli help text too to remind jdarcy that it needs done.
22:45 glusterbot http://goo.gl/UUuCq
22:53 glusterbot New news from newglusterbugs: [Bug 884327] Need to achieve 100% code coverage for the utils.py module <http://goo.gl/T4vbT> || [Bug 884328] quorum needs cli help text <http://goo.gl/33ipV>
22:54 rbennacer left #gluster
22:57 nightwalk joined #gluster
23:08 TSM2 joined #gluster
23:10 Nr18 joined #gluster
23:10 morse joined #gluster
23:13 neofob joined #gluster
23:22 saz_ joined #gluster
23:45 gbr joined #gluster
23:46 nightwalk joined #gluster
23:49 hattenator joined #gluster
