IRC log for #gluster, 2012-10-23

All times shown according to UTC.

Time Nick Message
00:00 berend joined #gluster
00:07 benner joined #gluster
00:12 JoeJulian @mount
00:12 glusterbot JoeJulian: I do not know about 'mount', but I do know about these similar topics: 'If the mount server goes down will the cluster still be accessible?', 'mount server'
00:13 JoeJulian ~mount server | leejohn
00:13 glusterbot leejohn: (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrnds, or (#2) Learn more about the role played by the server specified on the mount command here: http://goo.gl/0EB1u
00:17 JoeJulian @later tell leejohn Sorry you couldn't wait long enough to get the answer. Say "@mount server" in channel to get your answer.
00:17 glusterbot JoeJulian: The operation succeeded.
00:32 ondergetekende joined #gluster
00:43 penglish Somehow I had not seen this: https://access.redhat.com/knowledge/node/66206
00:43 glusterbot Title: Red Hat Storage Server 2.0 Compatible Physical, Virtual Server and Client OS Platforms - Red Hat Customer Portal (at access.redhat.com)
00:44 JoeJulian What?!?! You missed part of the internet?
01:25 bala2 joined #gluster
01:31 glusterbot New news from resolvedglusterbugs: [Bug 839768] firefox-10.0.4-1.el5_8-x86_64 hang when rendering pages on glusterfs client <https://bugzilla.redhat.com/show_bug.cgi?id=839768>
01:32 kevein joined #gluster
01:45 lng joined #gluster
01:48 lng Hi! After one of the replicated nodes (GlusterFS 3.3) was offline, self-heal doesn't correct file difference. Why?
01:56 lng I thought it should be done automatically...
01:56 lng how to sync a bricks?
01:59 aliguori joined #gluster
02:09 ika2810 joined #gluster
02:31 sunus joined #gluster
02:45 neofob joined #gluster
03:39 shylesh joined #gluster
03:54 sgowda joined #gluster
03:55 JoeJulian lng: What version?
03:56 vpshastry joined #gluster
04:00 lng JoeJulian: 3.0
04:00 lng I have updated my cluster after switching to hostnames...
04:01 JoeJulian updated? so 3.3.1 then, not 3.0?
04:01 lng they are different versions
04:02 lng the one not synced is 3.3.0
04:02 lng sorry
04:02 lng and live is 3.3.1
04:02 JoeJulian ok
04:03 JoeJulian check "gluster volume heal $volname info" to see if it shows anything pending
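For reference, the self-heal commands being discussed look like this in GlusterFS 3.3; a minimal sketch assuming the volume is named "storage", as it appears later in this log:

    # list entries the self-heal daemon still has queued, per brick
    gluster volume heal storage info
    # list entries it could not reconcile on its own
    gluster volume heal storage info split-brain
    # heal only the entries already marked as pending
    gluster volume heal storage
    # force a full crawl of the bricks (useful after a node was offline)
    gluster volume heal storage full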
04:03 lng JoeJulian: okay
04:03 lng by cpu graph, I seen the growth after reboot
04:03 lng on both instances
04:04 lng probably they synced
04:04 lng that's why ther's spike
04:04 JoeJulian probably
04:05 lng JoeJulian: I used sed to substitute IPs in configs
04:05 lng is it okay?
04:05 * JoeJulian shrugs
04:05 JoeJulian if it works, then I guess so.
04:05 lng maybe after probing and reboot it would change them automatically?
04:05 JoeJulian I've never tried it, myself.
04:05 lng I don't want to try on production :-)
04:06 lng I have test env for that
04:06 lng but I don't want to spend time on that
04:06 lng need to eat something...
04:06 lng bbl
04:47 sashko joined #gluster
04:48 sripathi joined #gluster
04:54 tru_tru joined #gluster
04:56 sunus joined #gluster
04:59 zhashuyu joined #gluster
05:06 mrkvm joined #gluster
05:16 bala1 joined #gluster
05:23 ramkrsna joined #gluster
05:23 ramkrsna joined #gluster
05:24 64MAB4P8S joined #gluster
05:26 samkottler|out joined #gluster
05:27 mohankumar joined #gluster
05:36 raghu joined #gluster
05:37 Teknix joined #gluster
05:39 hagarth joined #gluster
05:43 duerF joined #gluster
05:56 badone__ joined #gluster
06:13 sunus hi where can i found the fuse part of code in glusterfs?
06:14 sunus which files exactly?
06:15 hagarth joined #gluster
06:20 ramkrsna_ joined #gluster
06:20 mdarade joined #gluster
06:31 nightwalk joined #gluster
06:32 lkoranda joined #gluster
06:38 ramkrsna__ joined #gluster
06:39 lng In glusterfs 3.3.0, do I need to execute self-heal procedure after one of the nodes was offline for some time?
06:39 overclk joined #gluster
06:39 mdarade1 joined #gluster
06:40 lng or this is done automatically?
06:42 mrkvm joined #gluster
06:45 rferris joined #gluster
06:46 rferris left #gluster
06:48 deepakcs joined #gluster
06:53 hagarth joined #gluster
06:56 Nr18 joined #gluster
07:01 badone joined #gluster
07:02 ctria joined #gluster
07:04 lng `gluster volume heal storage info split-brain` > Segmentation fault (core dumped)
07:04 lng why is gluster so buggy?
07:07 dobber joined #gluster
07:11 sripathi joined #gluster
07:23 syoyo joined #gluster
07:24 andreask joined #gluster
07:26 TheHaven joined #gluster
07:30 nightwalk joined #gluster
07:32 Staples84 joined #gluster
07:42 pkoro joined #gluster
07:51 tru_tru joined #gluster
07:53 rkubany joined #gluster
07:53 tjikkun_work joined #gluster
07:55 Triade joined #gluster
07:56 puebele joined #gluster
07:59 tryggvil joined #gluster
08:06 guigui1 joined #gluster
08:26 hagarth joined #gluster
08:27 gbrand_ joined #gluster
08:27 sshaaf joined #gluster
08:29 ramkrsna_ joined #gluster
08:31 mdarade1 joined #gluster
08:32 Tarok joined #gluster
08:32 ramkrsna joined #gluster
08:32 ramkrsna joined #gluster
08:34 mohankumar joined #gluster
08:37 TheHaven joined #gluster
08:46 Tarok_ joined #gluster
08:51 sripathi joined #gluster
08:55 lng what does it mean? State: Peer Rejected (Connected)
08:59 puebele1 joined #gluster
09:02 ramkrsna_ joined #gluster
09:02 glusterbot New news from resolvedglusterbugs: [Bug 822083] [27ae1677eb2a6ed4a04bda0df5cc92f2780c11ed]: glusterfs client hangs, thus the application running on it <https://bugzilla.redhat.com/show_bug.cgi?id=822083>
09:03 mohankumar joined #gluster
09:04 mdarade2 joined #gluster
09:05 lng resolved after glusterd restarted
09:07 lng now I have another one: State: Accepted peer request (Connected)
09:15 manik joined #gluster
09:16 manik joined #gluster
09:24 sripathi joined #gluster
09:29 ramkrsna__ joined #gluster
09:30 mdarade1 joined #gluster
09:31 hagarth joined #gluster
09:40 sunus joined #gluster
09:45 duerF joined #gluster
09:45 ramkrsna_ joined #gluster
09:47 mdarade3 joined #gluster
09:52 VisionNL Hi, our clients are Scientific Linux 5 (compatible with Red Hat Linux 5) and our servers are RHEL6 running GlusterFS 3.2.5. We intend to upgrade to 3.3 and want to know if this might be a harmful upgrade. We are currently having about 100TB of data in our 240TB total capacity and would hopefully avoid taking a risk in the upgrade.
09:53 VisionNL Also, our SLC5 clients don't seem to be able to upgrade beyond 3.2.6. Does anybody have experience with an upgrade to 3.3 on a Red Hat 5 (compatible) system?
09:59 lng what is /vols/[vol-name]/rbstate?
10:00 lng is it replace brick?
10:26 puebele joined #gluster
10:47 edward1 joined #gluster
10:52 hagarth joined #gluster
10:54 tryggvil_ joined #gluster
10:59 tryggvil joined #gluster
11:07 kkeithley1 joined #gluster
11:07 kkeithley1 left #gluster
11:08 kkeithley1 joined #gluster
11:11 zoldar what is the correct way to safely stop storage node? when try to stop it using init script, the glusterfs and glusterfsd processes hang around regardless.
11:14 zoldar is "gluster peer detach [server]" the right way to do that? Provided that I want to easily/promptly bring it back?
11:16 hagarth joined #gluster
11:18 kkeithley1 no, that dismantles your volume
11:19 kkeithley1 `gluster volume stop $volname` is the correct way to stop a volume.
11:20 zoldar kkeithley_wfh: but I want to temporarily disable one of the nodes, not the whole volume
11:21 kkeithley_wfh Then just kill the glusterfs and glusterfsd processes
11:21 kkeithley_wfh that's what I would do
11:22 kkeithley_wfh make sure you kill the correct ones, if you have more than one brick per node
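A sketch of what that looks like in practice; the volume name, brick path, and service name here are placeholders rather than anything zoldar actually ran:

    # find the PID of each brick export daemon (glusterfsd) on this node
    gluster volume status myvol
    # stop the management daemon (the init script only stops glusterd, not the bricks)
    service glusterfs-server stop        # Debian/Ubuntu package name; 'glusterd' on RPM-based systems
    # then stop only the glusterfsd process(es) serving this node's bricks
    kill <pid-of-glusterfsd-for-this-brick>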
11:23 puebele1 joined #gluster
11:24 zoldar kkeithley_wfh: that's what I did, but I've ended up with corrupted fs, of course I'm not 100% positive that this was the cause
11:25 ramkrsna joined #gluster
11:25 ramkrsna joined #gluster
11:26 mdarade joined #gluster
11:56 balunasj joined #gluster
12:55 ondergetekende joined #gluster
13:06 aliguori joined #gluster
13:10 tryggvil joined #gluster
13:13 manik joined #gluster
13:23 guigui1 left #gluster
13:29 puebele joined #gluster
13:30 Nr18_ joined #gluster
13:46 andreask joined #gluster
14:02 chouchins joined #gluster
14:02 Nr18 joined #gluster
14:06 hagarth joined #gluster
14:14 stopbit joined #gluster
14:14 sshaaf joined #gluster
14:16 puebele3 joined #gluster
14:32 wushudoin joined #gluster
14:32 wushudoin| joined #gluster
14:33 * johnmark fills head with Swift app ideas
14:33 glusterbot New news from resolvedglusterbugs: [Bug 867263] gluster peer probe appears in peer info <https://bugzilla.redhat.com/show_bug.cgi?id=867263>
14:38 sunus joined #gluster
14:45 lowtax2 joined #gluster
14:46 lowtax2 where is the on disk encryption documentation?
14:47 lowtax2 hrm
14:48 jdarcy lowtax2: It's part of HekaFS, so either http://www.hekafs.org or the raw doc source is at http://git.fedorahosted.org/cgit/CloudFS.git/tree/doc
14:48 glusterbot Title: HekaFS (at www.hekafs.org)
14:49 jdarcy lowtax2: Getting that into GlusterFS itself is part of the plan and on my to-do list, but it's a long to-do list.
14:49 lowtax2 jdarcy: ok
14:50 lowtax2 is it better to do hekafs than like cryptfs with glusterfs on top?
14:50 jdarcy lowtax2: Depends on where you want to set your security perimeter.  If you're OK with the server having keys, then you can just use ecryptfs/LUKS/whatever and GlusterFS doesn't even need to know about it.
14:51 lowtax2 ok
14:51 jdarcy lowtax2: The HekaFS encryption was designed for the specific case of *not* trusting servers (e.g. in a shared-hosting/as-a-service environment) so keys are only on clients.
14:52 lowtax2 clients?
14:52 jdarcy Performance-wise, the server-side (e.g. ecryptfs) solution is going to be much better.
14:52 lowtax2 so it isnt compatible with nfs?
14:52 duerF joined #gluster
14:53 jdarcy lowtax2: There is no way to express the information needed for proper encryption through the NFS protocol, so no, it isn't really compatible.  The HekaFS encryption is only for native protocol.
14:53 lowtax2 ok
14:53 lowtax2 thanks
14:53 lowtax2 so ecryptfs is what you recommend
14:54 jdarcy I don't actually have an opinion on that myself, but ecryptfs seems reasonably well regarded by people who know more than I do.
14:56 ika2810 joined #gluster
14:56 Technicool joined #gluster
14:57 pdurbin jdarcy: dunno if you noticed this tweet from one of the centos guys: "What is the lightest weight, simplest, will mostly work shared storage solution that isn't NFS ?" -- https://twitter.com/kbsingh/status/260088365351841792
14:57 glusterbot Title: Twitter / kbsingh: What is the lightest weight, ... (at twitter.com)
14:59 ika2810 joined #gluster
15:00 ramkrsna joined #gluster
15:00 ramkrsna joined #gluster
15:04 wushudoin| joined #gluster
15:06 ika2810 joined #gluster
15:07 sashko joined #gluster
15:15 jdarcy pdurbin: Thanks!  Looks like an interesting discussion has followed from that.
15:16 tc00per joined #gluster
15:17 tmirks joined #gluster
15:19 pdurbin jdarcy: sure. you're my goto guy for anything NFS now :)
15:20 pdurbin look at all the gluster fans :)
15:27 badone joined #gluster
15:27 pdurbin gluster is the new NFS :)
15:29 jdarcy We're the incumbent in our space, where "incumbent" means "favorite target"
15:32 johnmark pdurbin: heh heh :)
15:32 johnmark jdarcy: that's exactly right. Wear it with pride!
15:32 pdurbin heh
15:35 pdurbin it would be interesting to hear more of what karabir is looking for
15:36 pdurbin whoops. karanbir i mean :(
15:36 daMaestro joined #gluster
15:37 johnmark pdurbin: agreed
15:39 pdurbin johnmark: i sent him a link in ##infra-talk
15:39 johnmark pdurbin: oh good. I plan to meet him and RIP next Friday
15:39 johnmark for beeeeer
15:39 pdurbin mmm, beer
15:41 raghu joined #gluster
15:48 jdarcy johnmark: Where are you meeting RIP?
15:48 aliguori joined #gluster
15:59 ika2810 joined #gluster
16:03 bala1 joined #gluster
16:05 ika2810 left #gluster
16:09 akadaedalus joined #gluster
16:12 lh JoeJulian, ping
16:23 Tarok pong
16:24 johnmark jdarcy: London
16:24 johnmark jdarcy: I'm going there for Red Hat Dev Day
16:26 elyograg A question you may get tired of hearing: Are there any plans to add an auth secret to gluster, if for nothing else to prevent arbitrary nodes from being added with gluster peer probe?  Ideally it would also handle authentication of all connection initiations.  Also possible, but perhaps not desirable for performance reasons, it could encrypt all inter-server communication.
16:28 semiosis elyograg: the constraints on probing are... probe must be sent from a server in the pool to a server not in any pool.  if a server is already in a pool it will reject probes from servers outside its own pool.
16:29 semiosis so that limits how "arbitrarily" nodes can be added
16:30 semiosis elyograg: so you're proposing that an admin would have to also add a secret, say in some config file, before the new server will accept probes?  then only probes from servers with the same secret would be accepted?
16:30 Mo__ joined #gluster
16:30 lh johnmark, see PM please
16:32 elyograg semiosis: yes.  just thinking about other "cluster" software that I am familiar with, like heartbeat and corosync.  The existing setup does prevent unathorized things from happening.  I guess the only thing really left is reducing the likelihood of admin screwups in large environments with many people that have full admin access.
16:33 semiosis gotcha.  there was some work being done with ,,(hekafs) that's kinda related, tho idk if that exact feature was considered
16:33 glusterbot CloudFS is now HekaFS. See http://hekafs.org or https://fedoraproject.org/wiki/Features/CloudFS
16:33 elyograg semiosis: you can never prevent an admin with full access and an intent on doing harm, of course.
16:34 semiosis yeah
16:35 semiosis jdarcy: what ever happened to all that crypto stuff you were working on a year ago?  seems i've not heard much about it lately
16:35 kkeithley_wfh multi-tenat
16:35 semiosis kkeithley_wfh: yeah, that too... ?
16:35 kkeithley_wfh multi-tenant, encryption, and all that stuff from HekaFS is on the roadmap for 3.5 probably at this point
16:35 semiosis cool!
16:36 kkeithley_wfh Actually, I hope the wire encryption happens sooner.
16:36 rosco_ joined #gluster
16:36 semiosis kkeithley_wfh: +1, seems it would be real important with the multi-master georep & ,,(ponies)
16:36 glusterbot kkeithley_wfh: http://hekafs.org/index.php/2011/10/all-that-and-a-pony/
16:37 semiosis even in a single-tenant setting
16:37 kkeithley_wfh yup
16:37 semiosis @roadmap
16:37 glusterbot semiosis: See http://gluster.org/community/documentation/index.php/Planning34
16:37 jdarcy The wire encryption is already in there, but the management for it could use some UX love.
16:39 lh JoeJulian, you have mail
16:41 kkeithley_wfh right, encryption is in. I was too lazy to look and didn't want to say it was in and be wrong.
16:50 Nr18 joined #gluster
16:54 eightyeight joined #gluster
16:56 eightyeight so, the reason for the disconnect?
16:56 eightyeight i live migrated my irc vm from one hyper to the other, so i could reboot it
16:57 eightyeight rebooting it froze my internet connection. what's up with that?
16:57 eightyeight if the vm is on box #2, and i reboot box #1, it shouldn't affect the network for the vm
16:59 weebucket joined #gluster
16:59 crashmag joined #gluster
16:59 weebucket In glusterfs 3.3 how do you actually set these settings? -> http://gluster.org/community/documentation/index.php/Gluster_Translators
17:00 weebucket I want to disable stat-prefetch, read-ahead, and write-behind
17:00 weebucket and quick-read
17:00 weebucket and io-cache... ok I just want to disable all of the performance options =)
17:01 jdarcy gluster volume set $myvol performance.whatever off
17:02 weebucket so:: gluster volume set gv0 performance/stat-prefetch Off ?
17:02 ika2810 joined #gluster
17:05 jbrooks joined #gluster
17:05 jdarcy Except for dot, not slash.
17:06 jdarcy I do kind of wonder why you'd want to turn off those options, though.  Most people want more performance translators, not fewer.
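Spelled out for the options weebucket listed, using the volume name gv0 from the question above; these are the stock 3.3 option names:

    gluster volume set gv0 performance.stat-prefetch off
    gluster volume set gv0 performance.read-ahead off
    gluster volume set gv0 performance.write-behind off
    gluster volume set gv0 performance.quick-read off
    gluster volume set gv0 performance.io-cache off
    # the changed options show up under "Options Reconfigured" in:
    gluster volume info gv0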
17:06 ika2810 joined #gluster
17:07 jdarcy eightyeight: Are you replicating across boxes?
17:08 eightyeight jdarcy: yes
17:08 jdarcy eightyeight: So turning off one box should affect a VM running on the other, if that VM is using the replicated storage, right?
17:08 jdarcy Still shouldn't be a hang/freeze, of course.
17:09 weebucket @jdarcy running it on AWS with EBS, so EBS does all of those performance tuning options already, so gluster does not need to do them.
17:09 weebucket At least, that is my understanding from this blog post: http://www.sirgroane.net/2010/03/tuning-glusterfs-for-apache-on-ec2/
17:09 glusterbot Title: Ian Rogers » Tuning glusterfs for apache on EC2 (at www.sirgroane.net)
17:10 eightyeight jdarcy: it shouldn't affect network connectivity, no. the vm should remain online, without hiccup
17:10 eightyeight which makes me wonder if there was a bad arp cache
17:10 jdarcy weebucket: EBS doesn't/can't do a lot of those.  For example, it doesn't know data from metadata so it can't do stat-prefetch.  AFAIK it doesn't do write-behind for other reasons, though it might do read-ahead.
17:11 semiosis weebucket: it should be noted here that the article is about glusterfs 3.0, and we have come a long way since then... mainly volfiles should not be edited by hand anymore
17:11 semiosis let gluster cli generate them
17:11 semiosis defaults work fine on ebs
17:12 jdarcy weebucket: Even for optimizations that EBS can do, doing them higher up in the stack can still be more effective.  Why call down from FS to disk for data that the FS could cache itself?
17:14 jdarcy eightyeight: Is there any other cluster/heartbeat/failover software involved?
17:14 jdarcy eightyeight: GlusterFS doesn't change network configs, just uses what's there, so it seems highly unlikely that it's the cause of a network glitch like that.
17:15 eightyeight jdarcy: no. gluster is just replicating vm images on a zfs filesystem. that's it
17:15 jdarcy Oh, well, ZFS.  (Just kidding.)
17:16 jdarcy eightyeight: Do you know the nature of the network interruption?  Interface down, missing router, packets just disappear?
17:16 jdarcy Is one machine routing through the other? I know it seems unlikely, but you'd be surprised.
17:17 eightyeight it's strange
17:17 eightyeight i'm going to create a test vm, and follow the arp across migrations
17:18 eightyeight initially, i thought this might have something to do with gluster, but the more i look at it, it's either kvm or libvirt
17:18 eightyeight it could be the crappy netgear switch i'm connected to as well
17:18 jdarcy Those all sound plausible.  If you *do* find anything suggesting that GlusterFS is involved, could you please let us know so we can fix it?
17:19 eightyeight yes
17:20 jdarcy Excellent, thanks.
17:20 bulde1 joined #gluster
17:25 hagarth joined #gluster
17:30 y4m4 joined #gluster
17:30 Eco_ joined #gluster
17:34 JoeJulian :O
17:35 JoeJulian lh: interesting... It goes against one of my basic philosophies, but it does appeal to my vanity. ;)
17:35 lh JoeJulian, which philosophy is that?
17:35 lh vanity we can do :)
17:36 semiosis :O
17:36 hagarth :O
17:36 Eco_ :O
17:38 JoeJulian That books are out of date by the time they're printed. :)
17:39 JoeJulian Of course, so is the wiki, so I guess it's not *that* bad.
17:39 Eco_ :O
17:39 JoeJulian :O
17:39 lh JoeJulian, this gives you many opportunities for second editions
17:39 lh third editions
17:39 JoeJulian :)
17:39 * semiosis started writing a glusterfs book
17:39 lh see. vanity. works every time. :)
17:39 lh semiosis, where is it hiding?
17:39 JoeJulian lol
17:40 Eco_ i thought the glusterfs book was `git grep $term` ?
17:40 JoeJulian semiosis: We should collaborate. I'm not sure I'd be any good at it.
17:40 semiosis was half-way through a formal proposal to send to o'reilly when i realized i wasn't going to be able to commit to the time it would requires, and also exactly what JoeJulian said about being out of date as soon as it's published
17:40 Eco_ i can write all the hilarious captions and xkcd style cartoons
17:40 semiosis JoeJulian: ok
17:41 ika2810 left #gluster
17:42 semiosis as far as o'reilly books go the only format that seems like it could work would be an Essentials series edition
17:44 semiosis i was thinking three units. I. Glusterfs History, II. Glusterfs Theory, and III. Real world implementation best practices
17:45 semiosis part III would be the most time consuming for me because i'm only really familiar with one real world use-case, web media
17:45 JoeJulian If there's an outline, I can easily fill in topics. Coming up with the outline is always the hard part.
17:45 semiosis lots of people want to do UFO, Hadoop, VM serving
17:45 semiosis JoeJulian: i'll dig it up
17:45 * JoeJulian knows a lot more about UFO today than I did Friday.
17:45 pdurbin semiosis: wow, roman numerals even
17:46 elyograg when in rome ... or something.
17:46 semiosis pdurbin: aren't units done in roman?
17:46 JoeJulian The romans knew a lot about clustered storage.
17:46 pdurbin unit the first... nfs must die
17:46 semiosis hehe
17:46 maxiepax joined #gluster
17:47 semiosis lh: where are you in the process?  just thinking about it or do you actually have interest from a publisher?
17:47 JoeJulian I think you're thinking of eunuchs being roman...
17:48 Nr18 joined #gluster
17:49 pdurbin it's almost like they named a certain operating system after them
17:49 semiosis bbiab, lunch
17:49 maxiepax redhat recommend using SAS drives for gluster HPC storage, however, im wondering if anyone has any experience on if this is still valid when the amount of nodes increases radically? f.ex. 34 nodes, each with 12 discs and 1gb cache hardware raid.
17:49 lh semiosis, this is a random idea i removed from my brain yesterdayish
17:49 * lh has to go have lunch with the Red Hat Portland crew bbl
17:50 semiosis lh: cool, ok ttyl
18:07 semiosis so this book was on the top of my mind around the time of the redhat summit in june
18:07 semiosis and the #docathon was one result of it
18:13 neofob joined #gluster
18:17 jiqiren joined #gluster
18:23 nuttyhazel joined #gluster
18:23 nuttyhazel If I had gluster installed across two availability zones, how would it have handled amazon's failure yesterday?
18:25 gbrand_ joined #gluster
18:26 semiosis nuttyhazel: i've been there
18:26 semiosis the ec2 problem had to do with ebs volumes, so if you were using only instance-store (for os root & storage) you might not have been affected at all
18:27 semiosis but usually (and as I recommend) people do use ebs volumes for glusterfs
18:27 semiosis and in that case, if you'd had a volume replicated between two AZs *and* you lost ebs volumes to the incident...
18:28 semiosis there are a few possible failure scenarios, of which the most extreme would be that the root disk ebs volume of one of your glusterfs servers became degraded, and the whole server died or locked up
18:29 semiosis your volume should've continued running with the remaining server, after a brief timeout (ping-timeout) default of 42s
18:30 imcsk8 joined #gluster
18:31 johnmark @channelstats
18:31 glusterbot johnmark: On #gluster there have been 33979 messages, containing 1504517 characters, 249223 words, 982 smileys, and 147 frowns; 249 of those messages were ACTIONs. There have been 12036 joins, 392 parts, 11623 quits, 1 kick, 23 mode changes, and 4 topic changes. There are currently 162 users and the channel has peaked at 185 users.
18:31 nuttyhazel Ok cool, thank you, I would use replication across the availability zones with just two servers, using ebs as a backing store.
18:33 semiosis nuttyhazel: weird things happen in failures
18:39 nuttyhazel Yes, currently we are using just a single ebs volume, but would like to add gluster to keep yesterday from killing our services too. The odd thing is the ebs volumes would actually respond, but only after like 30 minutes of the process sitting in a D state.
18:40 Fabiom can I use autofs to mount volumes with glusterfs client ?
18:41 imcsk8 hello, i have a problem trying to mount a glusterfs volume on a remote computer, i get this error on glusterd log file: [2012-10-23 12:28:43.176085] W [rpcsvc.c:179:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 14398633 1)
18:43 jdarcy I don't know about this time, but the last major EBS incident I recall turned out to be contagious across AZs.
18:44 semiosis nuttyhazel: there is one issue that you ought to be aware of: bug 832609
18:44 semiosis glusterbot: i said, bug 832609
18:44 semiosis glusterbot: meh
18:44 glusterbot semiosis: I'm not happy about it either
18:44 semiosis https://bugzilla.redhat.com/show_bug.cgi?id=832609
18:44 glusterbot Bug 832609: urgent, high, ---, rabhat, ASSIGNED , Glusterfsd hangs if brick filesystem becomes unresponsive, causing all clients to lock up
18:45 semiosis bugzilla is sloooow right now
18:45 jdarcy http://pl.atyp.us/wordpress/index.php/2011/04/amazons-outage/ and http://pl.atyp.us/wordpress/index.php/2011/04/more-fallout-from-the-aws-outage/ and http://pl.atyp.us/wordpress/index.php/2011/04/amazons-own-post-mortem/
18:45 glusterbot Title: Canned Platypus » Blog Archive » More Fallout from the AWS Outage (at pl.atyp.us)
18:45 nuttyhazel semiosis: the ubuntu lauchpad ppa has the wrong package name in the description, should it not be glusterfs-server as glusterd does not exist as a package in that ppa?
18:45 semiosis nuttyhazel: well it's not "wrong" just different :P
18:45 semiosis but yeah i'm transitioning away from those to the standard package names, see the ubuntu-glusterfs-3.3 pps
18:45 semiosis ppa
18:46 nuttyhazel is that the one I should be using instead?
18:47 semiosis nuttyhazel: to make a long bug report short, basically if an ebs backed brick becomes unresponsive your clients may lock up waiting for ops to complete
18:48 semiosis shutting down the server lets clients continue, there may be other solutions as well but i've not been able to test that too much yet
18:48 semiosis nuttyhazel: probably yes, the ubuntu- one
18:50 nuttyhazel semiosis: i figured the ping-timeout would take care of that bug?
18:51 semiosis if it did, it would not be a bug :)
18:52 semiosis bug is that nothing takes care of it, everything just hangs
18:52 nuttyhazel so ping-timeout is not client side then?
18:52 tqrst joined #gluster
18:53 semiosis yeah but this scenario isn't a ping timeout
18:53 semiosis ping timeout is when the server/brick drops off the network
18:53 nuttyhazel ah... ok I get it.
18:53 semiosis in this case it's on the network, but not servicing iops
18:53 nuttyhazel right
18:54 semiosis frame timeout might get it, but that's like 30 min, which is basically "for ever" imho
18:54 semiosis and i'm not sure if frame timeout even did kick in
18:55 aliguori joined #gluster
18:55 nuttyhazel could frame timeout be set to something like 1 minute?
18:55 semiosis i suppose so but idk the consequences
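For context, both timeouts are ordinary volume options; a hedged sketch with the 3.3 defaults shown as the values (the volume name is a placeholder, and as semiosis notes the consequences of lowering them are not well explored here):

    gluster volume set myvol network.ping-timeout 42      # seconds before a silent server is declared down
    gluster volume set myvol network.frame-timeout 1800   # seconds before an outstanding operation is abandoned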
19:18 Psi-Jack Okay.. So I'm trying to auto-mount  glusterfs 3.3 volume, using mount -t glusterfs glusterfs01:/gv0 /data, and it's failing with: error while getting volume file from server glusterfs01
19:21 JoeJulian @ports
19:21 glusterbot JoeJulian: glusterd's management port is 24007/tcp and 24008/tcp if you use rdma. Bricks (glusterfsd) use 24009 & up. (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111.
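A sketch of iptables rules matching glusterbot's port list; the upper brick port depends on how many bricks have ever been created on the node, so 24024 below is only an example:

    iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT   # glusterd management (24008 only if using rdma)
    iptables -A INPUT -p tcp --dport 24009:24024 -j ACCEPT   # one brick port per brick, counting up from 24009
    iptables -A INPUT -p tcp --dport 38465:38468 -j ACCEPT   # built-in NFS server + NLM
    iptables -A INPUT -p tcp --dport 111 -j ACCEPT           # portmap/rpcbind, needed for NFS
    iptables -A INPUT -p udp --dport 111 -j ACCEPT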
19:21 JoeJulian Psi-Jack: Maybe iptables?
19:21 JoeJulian Otherwise...
19:21 Psi-Jack Negative... But.. Hmmm. Interesting thought... Maybe firewall between the two points still. Looking into it. ;)
19:22 JoeJulian You should be able to telnet from the client to 24007 and have it connect.
19:22 Psi-Jack Looks like 24009 , actually. What's 24007 about?
19:23 JoeJulian 24007 is the management port. That's where the client retrieves the volume configuration from. If it's not listening, glusterd isn't running.
19:23 Psi-Jack Well, telnet to both ports don't give me active refusals.
19:23 Psi-Jack Gotcha. Telnet was successful, but don't know what to tell it to get a response back. :)
19:24 JoeJulian That's ok. We've proven it's not firewalled.
19:24 semiosis Psi-Jack: why auto-mount?
19:24 semiosis 9/10 times people use automount because their fstab mount didnt work at boot time
19:24 JoeJulian ~pasteinfo | Psi-Jack
19:24 glusterbot Psi-Jack: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
19:25 semiosis s/people use/people want to use/
19:25 glusterbot What semiosis meant to say was: 9/10 times people want to use automount because their fstab mount didnt work at boot time
19:25 Psi-Jack semiosis: I'm just trying to mount it, period.
19:25 semiosis Psi-Jack: what distro?
19:25 Psi-Jack Ubuntu 10.04, from your PPA. ;)
19:26 semiosis 10.04?  you know 12.04 has been out for a while now
19:26 Psi-Jack http://dpaste.org/gLL8x/
19:26 glusterbot Title: dpaste.de: Snippet #211660 (at dpaste.org)
19:26 semiosis thought about upgrading?
19:26 Psi-Jack semiosis: Yes, I know. And yes, we are. ;)
19:26 semiosis sadly, i can't support 10.04
19:27 Psi-Jack Eh? Sadly? It's the same thing, is it not? ;)
19:27 semiosis the boot-time mount stuff was inadequate in that release, all i had was a kludgy hacked solution
19:27 Psi-Jack Ahhhh
19:27 Psi-Jack The only thing I'm trying to do is mount, at this point.
19:27 mrkvm joined #gluster
19:27 semiosis are you trying to mount from localhost at boot time?
19:27 Psi-Jack And it's failing to do that, from the glusterfs-client boxes.
19:27 semiosis or from a remote client?
19:27 Psi-Jack semiosis: No, a remote server.
19:28 Psi-Jack localhost is working just fine.
19:28 semiosis ah, ok, then most likely a simple configuration issue
19:28 JoeJulian was glusterfs01 obfuscation or a cname?
19:28 Psi-Jack Yeah, far as any issues go AFTER this, I'll be able to support and engineer the workarounds needed. ;)
19:28 Psi-Jack JoeJulian: obfuscation, which is revealed as actual from the paste.
19:29 semiosis Psi-Jack: please pastie your client log file, or the last ~20 lines from it
19:30 Psi-Jack Oh, sheash, and yet, from mfweb01, it worked, mfweb02, it didn't.. Fricken crazy. ;)
19:31 semiosis sorry, to clarify, i can't support the upstartified stuff for mounting from localhost at boot time on ubuntu older than precise... it's too hacky
19:31 semiosis but this is different, should be an easy fix once we see the problem in the logs
19:31 Psi-Jack semiosis: Ahhhhh. ;)
19:31 Psi-Jack Right. ;)
19:33 Psi-Jack http://dpaste.org/rITSk/  is the mount attempt log
19:33 glusterbot Title: dpaste.de: Snippet #211661 (at dpaste.org)
19:34 Psi-Jack Which is strange, because, that shows connection refused, yet, I can telnet just fine to both the management port, and the volume port.
19:37 Psi-Jack D'oh...
19:37 semiosis well, what was it?
19:37 Psi-Jack Had glusterfs 3.0 installed on the mfweb02. ;)
19:38 Psi-Jack From Ubuntu Lucid's own repos.
19:38 semiosis that's a D'oh indeed
19:38 Psi-Jack Yeaaaah... Heh. So now it works great! :)
19:39 semiosis rock & roll
19:40 Psi-Jack Cool, and I can setup the backupvolfile-server and everything in 3.3.0. Very nice. ;)
19:41 semiosis ~rrdns | Psi-Jack
19:41 glusterbot Psi-Jack: You can use rrdns to allow failover for mounting your volume. See Joe's tutorial: http://edwyseguru.wordpress.com/2012/01/09/using-rrdns-to-allow-mount-failover-with-glusterfs/
19:41 semiosis i think rr-dns is recommended over the backup volfile server
19:41 semiosis JoeJulian: can you confirm?
19:41 Psi-Jack semiosis: I decline. I'd rather use CRM management VIPs for this to guarantee it. ;)
19:42 Psi-Jack That way the CRM handles where the VIP is on a guaranteed working server, and it can always act as backup, or primary even. ;)
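For completeness, the backupvolfile-server option Psi-Jack refers to is just a mount option; a hedged fstab sketch reusing the names from earlier in the log (glusterfs02 is a made-up second server):

    glusterfs01:/gv0  /data  glusterfs  defaults,_netdev,backupvolfile-server=glusterfs02  0  0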
19:42 steven_ joined #gluster
19:44 steven_ question: why would gluster peer status show all the peers connected, but volume status shows No PIDs and N for online for every brick on that system?
19:45 Psi-Jack semiosis: The main problem you discovered was with 10.04's localhost mounting, or remote-client mounting, or both?
19:48 semiosis steven_: just a wild guess... because the volume is not started
19:49 raghu joined #gluster
19:49 steven_ "Volume mail already started"
19:49 semiosis Psi-Jack: mounting glusterfs volumes from localhost at boot time.  problem is that mounts are executed before the local glusterd is running
19:50 steven_ what happened is I had to replace all the drives in a server, remounted everything, and restarted gluster, but none of the new drives are coming up
19:50 semiosis steven_: check brick logs on that server... there should be messages about why the glusterfsd ,,(processes) could not start
19:50 glusterbot steven_: the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more
19:50 glusterbot information.
19:51 steven_ semiosis thank you
19:52 steven_ then just run a heal all right?
19:55 chouchin_ joined #gluster
20:09 Daxxial_1 joined #gluster
20:31 badone joined #gluster
20:34 semiosis steven_: you're welcome
20:34 deckid joined #gluster
20:35 semiosis yes, heal sounds like a good idea
20:35 Psi-Jack Hmm.
20:36 Psi-Jack semiosis: Gotcha. I think I already know good ways to work around that issue. Done it in the past anyway. ;)
20:36 Psi-Jack So, GlusterFS 3.3.x can actually, easily, provide, from the same physical server clusters, different volumes to different groups of servers, replicated and distributed?
20:37 steven_ yeah heal didnt seem to do anything, the df -k output on the new drives is still the same
20:39 steven_ is there a heal log? I dont see it
20:40 semiosis client log?
20:40 Psi-Jack Ahh, the upstart jobs are experimental, eh?
20:40 semiosis Psi-Jack: not really anymore
20:40 Psi-Jack No? The launchpad stuff needs to be updated, then. ;)
20:40 semiosis where do you see that?  i'll remove the experimental note :)
20:41 Psi-Jack https://launchpad.net/~semiosis/+archive/upstarted-glusterfs-3.3
20:41 glusterbot Title: upstarted-glusterfs-3.3 : semiosis (at launchpad.net)
20:41 semiosis use the ubuntu-glusterfs-3.3 now
20:41 semiosis i'm going to phase out the other PPAs
20:41 semiosis ubuntu-glusterfs-3.3 uses the standard package structure from the official repos, and will probably be merged into ubuntu universe at one point
20:42 steven_ perhaps this has something to do with it [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket" then [cli-rpc-ops.c:5904:gf_cli3_1_heal_volume_cbk] 0-cli: Received resp to heal volume
20:42 steven_ thats in the cli log, otherwise it doesn't really talk about healing at all in logs
20:42 Psi-Jack semiosis: Cool. Yeah, at home I'm trying out GlusterFS 3.3 on Ubuntu 12.04. ;)
20:43 semiosis great, then yeah use the ubuntu-glusterfs-3.3 ppa and please let me know how it goes
20:47 Psi-Jack semiosis: Already checking it out. Left work from GlusterFS 3.3 PPA for 10.04, to come home and try GlusterFS 3.3 PPA for 12.04 LOL
20:47 Psi-Jack semiosis: Which, BTW, as I'm working on the 10.04 stuff, if I find any viable solutions for 10.04's issues you mentioned, would you like input on that? ;)
20:48 semiosis Psi-Jack: sure i'd welcome input :)
20:48 semiosis but beware the upstart... here be dragons
20:49 Psi-Jack Excellent, and glad to hear. I don't know your skill level with Ubuntu stuff, or Linux, personally, but I've got 21 years experience in Linux alone, plus several years experience with Ubuntu specifics. :)
20:49 Psi-Jack Including knowing upstart quite in-depth.
20:49 semiosis Psi-Jack: sweet!
20:49 Psi-Jack So, I /will/ find a solution.
20:50 semiosis great!  and i'd be happy to answer any questions to catch you up to speed
20:50 Psi-Jack Gotta love work sometimes. Playground at times, especially for us engineers. ;)
20:50 Psi-Jack Hehe.
20:50 Psi-Jack semiosis: When I get the chance to look further into it, tomorrow, I'll probably have some questions.
20:51 semiosis ok
20:51 deckid joined #gluster
20:51 semiosis the key points re: upstart is that ubuntu's flavor of upstart uses a program called mountall, and our goal is to intercept & block the events produced/consumed by mountall
20:52 Psi-Jack We're already talking about updating to 12.04, but we kinda can't just yet. The main reason I'm building out this new cluster in the first place is so we can properly scale out better as it is. Once it's all setup, we can start working on upgrading Ubuntu as well.
20:52 semiosis (and by block i mean delay)
20:52 * Psi-Jack nods.
20:52 Psi-Jack Yeah. That wonderful event based init system.. craziest design ever...
20:53 Psi-Jack I remember ALL of the annoyances I've had with some of it's invaluable PITAs. Such as bringing up ethX or bringing it down, it would fire off the "networking" event, which for LVS, caused everything to shut down and restart.
20:53 Psi-Jack Fun times. :D
20:54 Psi-Jack Some of Ubuntu's upstart event naming conventions were just all wrong... They should've included -start and -stop type events, not just /a/ generic all-encompasing event, which is part of it's design flaws. :)
20:55 semiosis not familiar with that, i thought there were starting/started/stopping/stopped events
20:55 semiosis but they're all one???
20:55 Psi-Jack Upstart sends "events" broadcasted, per-se. ;)
20:55 Psi-Jack Some of them have their own custom extra event they trigger directly from initctl.
20:56 Psi-Jack But, in the case of glusterfsd and localhost mounting, I'm betting we have to examine mountall, networking, and glusterfsd, before attempting to mount glusterfs-based mounts.
20:58 Psi-Jack And of course, insure that the glusterfs mounts have _netdev attached. But, really, an event fired off on glusterfsd to mount glusterfs based mounts should help resolve the problem. At just first thoughts. :)
20:58 semiosis ubuntu oneiric introduced a task called wait-for-state (/etc/init/wait-for-state.conf) that we use to block event "mounting TYPE=glusterfs" until event "started glusterd"
20:59 * Psi-Jack nods.
20:59 semiosis you can see that embarrasingly simple trick in /etc/init/mounting-glusterfs.conf
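Roughly, that job looks like the sketch below; this is an illustration of the wait-for-state pattern being described, not necessarily the exact file shipped in the PPA:

    # /etc/init/mounting-glusterfs.conf (sketch)
    # hold any fstab entry of TYPE=glusterfs until the local glusterd job is running
    start on mounting TYPE=glusterfs
    task
    exec start wait-for-state WAIT_FOR=glusterd WAITER=mounting-glusterfs WAIT_STATE=running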
20:59 Psi-Jack A very important improvement to upstart. ;)
20:59 semiosis that solves the problem in most cases, but i've had one report of a bridge interface causing issues there... let me find the logs
21:00 hattenator joined #gluster
21:00 Psi-Jack Ahhhh, bridges, yeaah.. ;)
21:01 semiosis conversation starts around here: http://irclog.perlgeek.de/gluster/2012-10-17#i_6072035
21:01 glusterbot Title: IRC log for #gluster, 2012-10-17 (at irclog.perlgeek.de)
21:03 semiosis so the issue then with 10.04 was that wait-for-state did not exist
21:03 Psi-Jack semiosis: Yeah, which means you have to use an alternative method for 10.04. :)
21:04 semiosis so i put the following in the glusterd upstart job: start on (local-filesystems and net-device-up IFACE=lo and net-device-up IFACE=eth0) or (mounting TYPE=glusterfs)
21:04 semiosis not even sure if the 'start on runlevel...' syntax was available in 10.04
21:04 Psi-Jack It is.
21:04 semiosis oh ok
21:05 Psi-Jack hehe
21:05 semiosis @forget ppa
21:05 glusterbot semiosis: The operation succeeded.
21:06 semiosis @learn ppa as The official glusterfs 3.3 packages for Ubuntu are available here: https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3
21:06 glusterbot semiosis: The operation succeeded.
21:06 Psi-Jack Heh, interesting. glusterfs-client doesn't even use upstart stuff at all, I see.
21:06 semiosis right
21:07 semiosis only if glusterfs-server is installed do we care about blocking glusterfs client mounts, and even then we really only care when they're from localhost, but it's not possible to determine which those are
21:08 semiosis since localhost could appear as any hostname, or 127.0.0.1, or an IP on any local interface
21:08 Psi-Jack Well, technically, in a way, it /is/ possible.. But you have to parse the fstab file. ;)
21:08 semiosis more than just fstab parsing
21:08 Psi-Jack Well, getent ns <name> parsing as well.
21:09 semiosis fstab parsing + name resolution + interface addresses :)
21:09 semiosis messy business
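As a rough illustration of what "fstab parsing + name resolution + interface addresses" could look like in shell (the field positions assume a standard fstab; loopback entries and more exotic setups would need extra care):

    # for each glusterfs entry in fstab, check whether its server name resolves to a local address
    awk '$3 == "glusterfs" {print $1}' /etc/fstab | cut -d: -f1 | sort -u | while read host; do
        addr=$(getent ahosts "$host" | awk 'NR==1 {print $1}')
        if ip -o addr show | grep -qwF "$addr"; then
            echo "$host -> $addr is local: treat this as a localhost mount"
        fi
    done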
21:09 Psi-Jack yeah, but you can do all the programming for that in 5 minutes, right? ;)
21:10 semiosis yes but i'll only do it in haskell or erlang... because i've been meaning to learn one of those :P
21:10 Psi-Jack haha
21:10 semiosis or maybe node.js
21:10 semiosis you know, something useful
21:11 hattenator node.js is easy enough, but I'd be incredibly impressed if you learned erlang in 5 minutes.
21:11 Psi-Jack OKay, so I see glusterd.conf, and mounting-glusterfs.conf in 12.04. And only glusterd.conf in 10.04 :)
21:11 semiosis yep, with the funky start on stuff
21:11 semiosis without wait-for-state we do the intercept & block in the glusterd job itself
21:12 semiosis the sleep is even required because glusterd returns before it's really ready
21:12 Psi-Jack Yep. :)
21:12 Psi-Jack Cool.. So I'll see what I can come up with tomorrow on that. I think I should have something by COB tomorrow. ;)
21:13 semiosis thanks!  awesome :)
21:15 Psi-Jack Interesting that in 12.04 you used mounting-glusterfs, instead of mounted-glusterfs. ;)
21:15 semiosis why's that interesting?
21:15 Psi-Jack Standard convention is actually mounted-<fstype>.conf
21:16 semiosis heh, the ubuntu devs never mentioned that
21:16 Psi-Jack because upstart triggers the 'mounted-<fstype>' upon completion.
21:16 Psi-Jack Each one, once they finish starting, will fire off an event for each upstart process. ;)
21:17 semiosis i'm looking at a few mounted-*.conf files on my 12.04 workstation and they all (so far) seem to start on mounted
21:17 semiosis whereas mine starts on mounting
21:18 semiosis yep, all of them
21:19 semiosis u sure about that convention? :)
21:20 raghu joined #gluster
21:21 Psi-Jack Yep.
21:21 Psi-Jack Cause even in 10.04, mounted-varrun.conf has: start on mounted MOUNTPOINT=/var/run TYPE=tmpfs
21:22 Psi-Jack Note the TYPE=tmpfs ;)
21:22 semiosis ok
21:23 Psi-Jack hehe. I already have a few ideas of how to resolve most of the problem with localhost mounting, and really, if you're localhost mounting, you should be pointing your glusterfs mountpoint to localhost:/volname anyway, not servername:/volname
21:23 Psi-Jack I'm assuming glusterd binds to 0.0.0.0
21:23 Psi-Jack Which it does.
21:25 Psi-Jack So, in 10.04 you can simply use: start on runlevel [2345]   for glusterd.conf, exec the glusterd as normal. And not have the post-start stuff at all, create a mounted-glusterfs.conf separately, to handle the actual mounting of the filesystems.
21:26 semiosis sounds good, what would you put in that mounted-glusterfs.conf?
21:26 semiosis could you pastie a sketch/example?
21:27 Psi-Jack Will need to test this out, but basically the same thing as what you have in 12.04's.. start on mounting TYPE=glusterfs    You /might/ need to consider a few more, since there is no wait-for-state, but accounting for localhost only.... glusterfsd should already be up and running properly by the time mountall-net hits.
21:27 Psi-Jack because, mountall-net starts on net-device-up.
21:28 aliguori joined #gluster
21:29 semiosis hmm, i'm not so confident in that "should" :)
21:31 Psi-Jack To make sure, start on (mounting TYPE=glusterfs and net-device-up IFACE=lo)
21:31 Psi-Jack Those are the only two you should be truly worried about with localhost based mounting. ;)
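Pulling the pieces of that proposal together, a lucid-era sketch (untested, and every line here is an assumption for illustration rather than what the PPA actually ships):

    # /etc/init/glusterd.conf (10.04 sketch)
    start on runlevel [2345]
    stop on runlevel [!2345]
    respawn
    exec /usr/sbin/glusterd --no-daemon   # keep it in the foreground so upstart can track it

    # /etc/init/mounting-glusterfs.conf (10.04 sketch; no wait-for-state job is available)
    start on (mounting TYPE=glusterfs and net-device-up IFACE=lo)
    task
    script
        # poor man's wait-for-state: poll until the glusterd job reports running
        while ! status glusterd | grep -q running; do sleep 1; done
        sleep 2   # glusterd returns before it is really ready (see below)
    end script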
21:33 Psi-Jack heh, funny.. wait-for-state is just another upstart job. :)
21:36 semiosis yeah!  it's very clever
21:36 Psi-Jack hewh
21:37 Psi-Jack Technically, you /could/ re-use that in a wait-for-glusterfsd job. :)
21:37 Psi-Jack Err, wait-for-glusterd
21:38 semiosis i suppose
21:38 semiosis lets say "one" could ;)
21:39 Psi-Jack Heh, though looking at your mounting-glusterfs.conf, I'm wondering how exactly that it even does the mounts at all.
21:39 semiosis well, it doesn't do the mounts
21:39 semiosis it just blocks mountall from doing them for a bit
21:39 semiosis thats my understanding anyway
21:39 Psi-Jack Ahhhhh.. hence, mounting-* convention works.
21:40 semiosis SpamapS guided me toward this solution, so it should be good
21:40 * Psi-Jack nods.
21:41 y4m4 joined #gluster
21:41 semiosis and also accepted it into universe for precise
21:41 semiosis in the 3.2.5 package
21:42 semiosis now i just need to re-learn bzr so i can request the merge for 3.2.7 or 3.3.1
21:42 sashko guys anyone using ext4 with single volumes of over 16TB on centos or rhel?
21:42 JoeJulian sashko: Not if they're lucky: ,,(ext4)
21:42 glusterbot sashko: Read about the ext4 problem at http://joejulian.name/blog/glusterfs-bit-by-ext4-structure-change/
21:53 Psi-Jack heh
21:53 sashko JoeJulian: i'm actually asking for usage outside of gluster, just on a backup machine
21:54 semiosis i thought ext4 was limited to 16T
21:54 sashko people recommending xfs, but i don't trust xfs as much as ext
21:54 Psi-Jack semiosis: Yeah, I should have a solution tomorrow for sure. The main focus points are to insure localhost glusterfs mounting works, primarily. With THAT working, anything else needed should be managed by the admin anyway, for other particular interfaces. But lo is the important one.
21:54 sashko semiosis: no, actually the limit is 1EiB
21:54 sashko semiosis: the limit comes from e2fsprogs not being able to create a volume larger than 16TB for now
21:54 semiosis well how else are you going to create one then?
21:55 sashko well rhel and centos e2fsprogs, the actual source version does
21:55 sashko so version 1.42 does
21:55 sashko 1.41 doesn't
21:55 semiosis ah cool
21:55 semiosis Psi-Jack: cool, i'm looking forward to it :)
21:55 sashko but they don't recommend compiling 1.42 yourself against a kernel from rhel or centos, there are some kernel dependencies
21:55 sashko so i'm just stuck either waiting for back port  or someone found a solution :)
21:56 elyograg sashko: I have personally seen corruption problems in hardware crashes on database servers using xfs.  that was a long time ago.  it's my understanding that if you mount with barriers, or else you use a controller with battery-backed cache memory, that this doesn't happen.
21:56 Psi-Jack semiosis: Apparently, it's also Ubuntu's alternative mountall that's actually emitting everything, too, emitting mounting-<fstype> and all. But yet, very poorly (and not really) documented at all.
21:56 sashko elyograg: that's what i've heard, however not about barriers or bbu, might give that a try
21:57 sashko elyograg: would barriers even matter if you have write cache turned off?
21:57 Psi-Jack semiosis: Ubuntu custom made mountall, you can verify with mountall --help (report bugs to ubuntu-devel@lists.ubuntu.com), where-as the manpage for mountall is /completely/ unrelated. ;)
21:58 JoeJulian sashko: When we were having dinner at Red Hat summit, I was sitting across from Ric Wheeler, Senior Manager & Architect RHEL Kernel File System Group for Red Hat. He was pretty strongly suggesting xfs over ext4 even before that issue came up.
21:58 semiosis Psi-Jack: my fave part is this, from the mountall(8) man page: This is a temporary tool until init(8) itself gains the necessary flexibility to perform this processing; you should not rely on its behaviour.
21:58 elyograg sashko: i don't know.  typically, in situations where there is no controller cache, the OS doesn't do write caching but the physical drive does.  if you made sure you turned off the write cache on the drives themselves, you'd probably be ok.
21:58 Psi-Jack heh
21:58 Psi-Jack semiosis: Yeah.. But that's for GNU's mountall. Not Ubuntu's own custom one. ;)
21:59 semiosis wow, never knew that
21:59 semiosis what is gnu mountall?
21:59 Psi-Jack semiosis: mountall from Ubuntu is what actually emits everything for upstart, including TYPE=<fstype> as part of the emit. ;)
21:59 sashko elyograg: yes, but i don't think barriers forces the disk to write to disk, I believe some disks still ignore it?
21:59 sashko it's all a mess :)
21:59 sashko JoeJulian: that's good to know!
21:59 semiosis Psi-Jack: that man page says written by remnant so i figured it was talking about the ubuntu/upstart/mountall thing
22:00 Psi-Jack semiosis: What I'll end up doing for this alternative to wait-for-state, is have a pre-start in mounting-glusterfs do the same thing, but self contained.
22:00 semiosis Psi-Jack: nice
22:00 Psi-Jack Should be fine. ;)
22:01 elyograg sashko: in the past I saw even the most basic systems complain about barriers not being supported when mounting xfs.  although there may be exceptions, I think that linux is pretty good about knowing when a system/controller/disk combination will ignore barriers.  i could be complete wrong, though.
22:01 Psi-Jack Just got to block differently, and without relying on anything else.
22:02 sashko elyograg: alright, thanks!
22:02 Psi-Jack Because, what's technically happening is mountall, runs as a daemon, and executes "initctl emit mounting-<fstype> TYPE=<fstype> MOUNTPOINT=<mountpoint>
22:03 Psi-Jack semiosis: And while mounting-fstype is "running" it won't actually try mounting, until /after/ mounting-fstype is stopped again, or something to that effect, hence "blocking"
22:04 Psi-Jack Clever, in a way. :)
22:04 semiosis right, too clever
22:09 Fabiom joined #gluster
22:10 Psi-Jack Would have to look at Ubuntu's own mountall source to verify all that I said, but it's something along those lines, I'm sure. ;)
22:11 Psi-Jack Because, remember what I said. Standard convention is, mounted-fstype... Well, their mountall must be starting a pseudo event called mounted-fstype, waiting for mounting-fstype to complete.
22:12 semiosis i dont follow
22:14 Psi-Jack Actually, no, not even that. It fires off mounting-fstype, sees if anything triggers from that, if so, it waits, mounts, then emits mounted-fstype for post-mounting stuff to run.
22:16 Psi-Jack So, the mounting-glusterfs pre-start will just have to watch for glusterfsd to be running. :)
22:17 semiosis ...which is what wait-for-state does
22:17 semiosis sounds like a good plan
22:17 * Psi-Jack nods.
22:17 mrkvm joined #gluster
22:17 Psi-Jack Heh, yeah. Fun stuff :)
22:18 Psi-Jack Told ya I could come up with a solution. ;)
22:19 semiosis sure but what again was the problem?  i mean, the localhost mounting on ubu 10.04 did work the way i had it for most people
22:19 semiosis did it not work for you?
22:19 Psi-Jack Haven't tried. LOL
22:19 semiosis probably doesnt need to change then
22:20 Psi-Jack Ideally it should be a lot like 12.04's was done, though, if it works as I think it does (else, SpamapS wouldn't have suggested it), which would reduce the problem, and be properly handled. :)
22:21 semiosis i suppose so
22:22 Psi-Jack Because, glusterd and the mounts are technically different from each other.
22:22 semiosis they are
22:22 Psi-Jack Not everyone's going to localhost mount, so there's no reason to make glusterd wait on anything, itself.
22:23 semiosis to be clear, none of my solutions make glusterd wait, they all make the mounts wait for glusterd, in one way or another
22:23 semiosis because i agree there's no reason to make glusterd wait on anything
22:24 Psi-Jack That's where you're wrong. :)
22:24 semiosis ?
22:25 Psi-Jack in 10.04's glusterd.conf, you're making it wait for local-filesystems, net-device-up, iface=lo, eth0, etc. /or/ mounting to occur.
22:25 Psi-Jack So, technically, you're making it wait. :)
22:25 semiosis ok yeah you got me fine whatever
22:26 Psi-Jack Hehehe
22:26 semiosis everyone wants local-filesystems & networking up, even those who don't want to mount glusterfs volumes locally on the server
22:26 Psi-Jack Worst case, you'd make it wait for local-filesystems, simply because it does kind of rely on that,.. NIC interfaces, maybe, but if I understand it correctly, glusterd's will re-establish connections to the other nodes once it has the ability to connect.
22:27 semiosis not waiting on anything unnecessary, mr pedantic
22:27 semiosis :P
22:28 Psi-Jack Hehe, but yes.. I am definitely pedantic. It's why I'm in the business of being a Linux Systems Engineer. :D
22:28 semiosis clearly
22:29 tryggvil joined #gluster
22:33 Psi-Jack Drats. :(
22:33 Psi-Jack Now I'm annoyed. haha! I have the option to go to an Ubisoft Hooters event, where Hooters girls will be involved in showing off Just Dance 4... Or waiting for my fiance to get on so we can talk this week. :/
22:38 atrius anyone using geo-replication under Ubuntu?
22:49 koodough joined #gluster
22:55 koodough joined #gluster
23:24 Ryan_Lane joined #gluster
23:24 Ryan_Lane I'm running in hp cloud, and my instances were destroyed during maintenance work
23:24 Ryan_Lane I had gluster running in them
23:24 Ryan_Lane the volumes were on block storage
23:24 Ryan_Lane but the glusterfs volume configuration is gone
23:24 Ryan_Lane how can I add the disk into a volume?
23:40 Technicool Ryan_Lane, did you lose /var/log as well?
23:41 Ryan_Lane lost everything
23:41 Ryan_Lane except the block device
23:41 Technicool if you happened to know for sure the order the bricks were in, you can recreate the volume as it was before
23:41 Ryan_Lane if I try that isn't it going to tell me it was already part of a volume?
23:41 Ryan_Lane I know the order of the bricks
23:42 JoeJulian Yuck. Theoretically you could create a new trusted pool, then create a new volume with all your bricks in the same order as the first time and it should work.
23:42 Ryan_Lane ah ok
23:42 Technicool but the only way i know offhand to do that of course is with the vol file or piecing it together from /var/log/glusterfs*
23:42 Technicool ^^
23:42 JoeJulian Oh, right... already part of a volume.
23:42 Ryan_Lane welcome to the wonderful world of the cloud
23:42 JoeJulian well...
23:42 JoeJulian hmm, let me check something... brb.
23:42 Ryan_Lane oh, did I mention that the peer names and the peer IPs are different too?
23:42 Technicool any reason not to just setfattr -X ?
23:43 Technicool ryan, that shouldnt matter
23:43 Ryan_Lane ah ok
23:43 Ryan_Lane thankfully I had backups
23:43 Technicool as long as the order of the bricks is presented precisely as it was
23:43 Ryan_Lane and they were recent
23:43 Ryan_Lane but I needed to know this for the next time this happens
23:43 Technicool get out of here, Eco_
23:43 Ryan_Lane I'm sure there will be one
23:44 johnmark Technicool: get out of here?
23:44 johnmark Ryan_Lane: greets
23:44 Technicool in the absolute worst case, you can always rsync the data back into a pool
23:44 Technicool johnmark, happy to comply  ;)
23:45 johnmark haha :)
23:45 johnmark Technicool: well I'm getting out of here. it's late and need to get home
23:45 Technicool dude, you aren't home
23:45 Technicool you are pulling an Eco_
23:45 Technicool go home
23:45 Ryan_Lane Technicool: that's what I'm doing
23:46 johnmark Ryan_Lane: pulling an Eco? sounds violent ;)
23:46 Ryan_Lane heh
23:46 JoeJulian Technicool: I have an idea that might be able to avoid rebuilding the .glusterfs tree
23:46 Ryan_Lane no, restoring from backups
23:46 Technicool JoeJulian, you are speaking poetry, do go on
23:47 hattenator Are we assuming the volume is using distribute?  If not, I think reconstruction should be much simpler
23:52 JoeJulian Ok, this worked in my little test. Create the volume without the block storage being mounted in the same order it was before. Start and stop the volume. "getfattr -n trusted.glusterfs.volume-id -e hex $brick" save for later. Mount the block storage. "setfattr -n trusted.glusterfs.volume-id -v $savedid $brick" for each brick. Start the volume again.
23:53 JoeJulian I hope it works. Gotta run and catch a train home. ttfn.
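Condensed into commands, the recovery JoeJulian sketches looks roughly like this; the volume, server, brick, and device names are all placeholders, and it assumes the bricks go back in exactly their original order:

    # 1. recreate the volume on empty brick directories, bricks listed in the original order
    gluster volume create myvol replica 2 server1:/bricks/b1 server2:/bricks/b1
    gluster volume start myvol
    gluster volume stop myvol            # answer the confirmation prompt
    # 2. note the volume-id stamped on the new, empty bricks
    getfattr -n trusted.glusterfs.volume-id -e hex /bricks/b1
    # 3. mount the surviving block storage over the brick paths
    mount /dev/vdb /bricks/b1
    # 4. stamp the old data with the saved volume-id, on every brick
    setfattr -n trusted.glusterfs.volume-id -v 0x<saved-id> /bricks/b1
    # 5. bring the volume back up
    gluster volume start myvol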
23:53 Technicool later
23:53 Technicool the only part i am missing is why the volumes don't need to be in the same order as before
23:54 Technicool at least, the replica pairs would have to end up the same or backwards
23:55 semiosis i would create a new volume and copy the data in through the client from the bricks you have
23:55 semiosis pretty sure that's the safe way to do it
23:55 Technicool Ryan_Lane, did you clear the xattr's on the bricks before re-creating the volume?  if you have a small amount of data (< 2TB or so) and a decent connection then rsync should be fast enough
23:55 Ryan_Lane I just reformatted the disks
23:56 Technicool ah, well
23:56 Technicool my photorec skills aren't good enough to help you there
23:56 Ryan_Lane it's a really small amount of data
23:56 Ryan_Lane like 200MB or so
23:57 Technicool semiosis, if the amount of data was large enough to cause the rsync to take a significant amount of time, it's worth a shot since at worst you can do the rsync if it doesn't work
23:58 Technicool esp. if there is xattr data to be preserved that is not gluster specific, e.g. from a home grown app
