
IRC log for #gluster, 2015-01-06


All times shown according to UTC.

Time Nick Message
00:01 DurzoAU oooh.. found out why semiosis isnt rolling debs anymore.. theres a new ppa from the "gluster team" which is doing them since 3.5.3
00:01 DurzoAU which is really just semiosis anyway
00:02 purpleidea fubada: can you send me your manifest...
00:07 * semiosis has seen semiosis
00:07 semiosis DurzoAU: see the new ~gluster ,,(ppa)
00:07 glusterbot DurzoAU: The official glusterfs packages for Ubuntu are available here: 3.4: http://goo.gl/M9CXF8 3.5: http://goo.gl/6HBwKh 3.6: http://goo.gl/XyYImN -- See more PPAs for QEMU with GlusterFS support, and GlusterFS QA releases at https://launchpad.net/~gluster -- contact semiosis with feedback
00:07 semiosis the old ~semiosis PPAs are no longer being updated
00:08 DurzoAU semiosis, ohai2u, i just found it.. have been visiting your ppa url for weeks now hoping.. silly me didnt bother looking around
00:08 semiosis DurzoAU: yep, read that after i started talking
00:08 semiosis glad you found it!
00:08 DurzoAU maybe you should update the ppa description with new url
00:08 semiosis i need to post some notes or delete the old ones
00:08 semiosis will do
00:08 purpleidea fubada: i think i might have found the issue, but will need your manifest to confirm
00:08 DurzoAU im probably not the only retarded person who doesnt search
00:10 semiosis lol!  [18:53] <purpleidea> DurzoAU: he's a volunteer! keep that in mind :)
00:10 semiosis thanks for covering for me buddy
00:10 semiosis but yeah hopefully i'll never let things go *that* long
00:10 purpleidea semiosis: hehe yw
00:11 semiosis (4+ months)
00:13 DurzoAU is it possible to upgrade 3.5.2 to 3.6.1 without downtime? i notice the "rolling upgrade" option has been removed from the community upgrade docs
00:13 CyrilPeponnet DurzoAU I tested it today and it seems not
00:13 CyrilPeponnet mixing 3.6/3.6 nodes is not working fine
00:14 CyrilPeponnet 3.5/3.6
00:14 fubada purpleidea: coming up
00:14 DurzoAU thx
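A rough sketch of the per-server sequence for a rolling upgrade, assuming the Ubuntu ~gluster PPA packaging and a replicated volume named myvol (both placeholders). Given CyrilPeponnet's report above that mixed 3.5/3.6 peers misbehave, treat this as something to rehearse on spare nodes first, not a guarantee:

    # one replica peer at a time; make sure its partner is healthy before starting
    gluster volume heal myvol info        # should show no pending entries
    service glusterfs-server stop
    apt-get update && apt-get install glusterfs-server glusterfs-client
    service glusterfs-server start
    gluster peer status                   # wait until peers show Connected again
    gluster volume heal myvol info        # let self-heal finish before the next server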
00:16 fubada purpleidea: https://gist.github.com/aamerik/d98dcf6f990f19120bcb
00:17 fubada then i have a couple of defines
00:17 fubada do you need?
00:17 purpleidea fubada: aha, so what's in build_bricks/volume ...
00:17 purpleidea fubada: yeah...
00:17 fubada https://gist.github.com/aamerik/d98dcf6f990f19120bcb
00:18 purpleidea fubada: it's strange, because the code is acting like you didn't define host or bricks or something...
00:19 fubada strange, i have the peers all talking to eachother
00:19 fubada i have another usecase, different manifest
00:19 fubada same error
00:20 fubada one second
00:20 purpleidea fubada: if you want to look at the code i can show you what's happening...
00:20 CyrilPeponnet one thought: doesn't $again need to be enabled to make it work with the vrrp default settings?
00:20 fubada purpleidea: https://gist.github.com/aamerik/c2a6bd7d4728e76ddac2 another manifest
00:21 fubada purpleidea: sure
00:21 purpleidea CyrilPeponnet: $again turns exec['again'] on or off
00:22 purpleidea CyrilPeponnet: https://ttboj.wordpress.com/2014/03/24/introducing-puppet-execagain/
00:22 CyrilPeponnet yes I know :)
00:22 purpleidea CyrilPeponnet: maybe i misunderstood the question/comment sorry
00:23 purpleidea fubada: that should work... hmm strange. let me dig more
00:23 CyrilPeponnet it was only a thought about fubada's issue
00:23 purpleidea CyrilPeponnet: can you rephrase it?
00:24 CyrilPeponnet well I don't know what is the issue, but according to the latest manifest provided again is set to false. On my own manifest which is almost the same, everything works fine.
00:26 purpleidea CyrilPeponnet: ah, i think it's unrelated to $again
00:27 purpleidea CyrilPeponnet: are you using gluster::simple ?
00:27 CyrilPeponnet nope
00:28 purpleidea fubada: okay, i found one bug
00:28 CyrilPeponnet one quick question, one volume with 3 bricks should be set as replica 2 or 3 ?
00:28 fubada :)
00:29 purpleidea CyrilPeponnet: depends on # of hosts
00:29 CyrilPeponnet 3
00:29 purpleidea 3x3 can be a replica 3 if you want
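For the three-host layout purpleidea is describing, a minimal sketch of a replica 3 volume; hostnames, brick paths and the volume name are placeholders:

    # one brick per host, replica 3: every file lives on all three servers
    gluster volume create myvol replica 3 \
        server1:/export/brick1 server2:/export/brick1 server3:/export/brick1
    gluster volume start myvol
    gluster volume info myvol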
00:29 purpleidea fubada: okay, so do you see all three folders getting removed, or just vrrp ?
00:29 purpleidea you pasted both of those different scenarios
00:29 fubada seems like just vrrp
00:30 purpleidea https://gist.github.com/aamerik/64e210384a0e4c9cbaa1
00:30 fubada let me check
00:30 purpleidea https://gist.github.com/aamerik/1ab19505ca95692e7449
00:30 purpleidea the second case of just vrrp changing and getting deleted is a bug
00:30 purpleidea subtle, not dangerous, but legitimate
00:30 purpleidea good catch
00:31 purpleidea the first case of the three folders disappearing is a case of an illogical manifest (user error) afaict for now.
00:31 purpleidea 131683
00:31 purpleidea ^ ignore 131...
00:31 fubada im checking the first case, i think it happened on a machine that only uses gluster mount
00:31 purpleidea fubada: ah
00:31 purpleidea fubada: yeah, that could be
00:31 fubada yes, on a mount-only host, i see 3
00:31 purpleidea fubada: so that's another bug
00:31 purpleidea :P
00:31 fubada cool :)
00:32 purpleidea fubada: good catches
00:32 fubada yep got https://gist.github.com/aamerik/63844c4e013e9c1077d7 again
00:32 fubada purpleidea: foreman reports makes it real easy
00:33 fubada i was wondering why hosts are reporting changes when nothing new was checked into my puppet repos
00:33 purpleidea fubada: nope, it's legit, it's a good bug report. thanks
00:33 fubada thank you!
00:33 purpleidea fubada: two options: i can explain the fix, and you can try and write the patch
00:33 fubada sure
00:33 purpleidea fubada: or you can wait, and i'll try and do it this week?
00:33 fubada I can also wait, but if you have time I'd like to try
00:34 purpleidea fubada: it might be a tricky patch, so i'll explain, and worst case scenario it's a puppet learning experience
00:34 purpleidea okay
00:34 purpleidea so 1) vrrp case:
00:35 purpleidea if $vrrp is false: https://github.com/purpleidea/puppet-gluster/blob/master/manifests/host.pp#L215
00:35 purpleidea then this folder doesn't get created: https://github.com/purpleidea/puppet-gluster/blob/master/manifests/host.pp#L222
00:35 purpleidea and the parent folder purges it. (which is correct)
00:36 purpleidea the fact will mkdir to make sure it can operate on that folder, in case it didn't exist (because facts run before the manifest applies)
00:36 purpleidea so we need to probably just refactor the vrrp folder definition into a new class and always include it
00:36 purpleidea actually it's an easy fix.
00:37 fubada you mean 222:228?
00:37 purpleidea for the gluster::mount case, causing three of these, that's because of the same scenario, except that there is no host definition: https://github.com/purpleidea/puppet-gluster/blob/master/manifests/host.pp#L61
00:37 purpleidea so you'll never get those folders... but the fact still runs...
00:38 purpleidea for that, unfortunately, we probably have to facter it out into a separate "folders" sort of class, and include it in the mount class/type too...
00:38 purpleidea fubada: yep
00:38 fubada purpleidea: facts here are gluster facts?
00:38 fubada or puppet facts
00:39 fubada sorry stupid question
00:39 fubada # if empty, puppet will attempt to use the gluster fact
00:39 fubada threw me off
00:39 purpleidea fubada: yep, example: https://github.com/purpleidea/puppet-gluster/blob/master/lib/facter/gluster_vrrp.rb#L56
00:41 fubada okay thanks for explaining :)
00:41 purpleidea fubada: want to try the patches?
00:41 purpleidea err i mean try writing them?
00:41 fubada ill try
00:41 purpleidea fubada: okay, ping me if you're fed up or for review...
00:42 fubada the second one i'll have trouble with
00:42 fubada ill try the first
00:42 purpleidea fubada: if you hate it, i'll write them next week or this week
00:42 fubada thanks purpleidea
00:42 purpleidea fubada: yeah, exactly.
00:42 purpleidea the vrrp should be easiest
00:42 purpleidea the more general solution with the three folder case is actually not so bad, just a bit more involved.
00:42 purpleidea fubada: g2g ttyl
00:42 fubada cya
00:44 DurzoAU ,,paste
00:45 DurzoAU bleh
00:45 DurzoAU glusterbot, bleep bloop
00:46 DurzoAU can someone tell me why this is failing when trying to stop a geo-repl volume in 3.5? http://fpaste.org/166046/14205051/raw/
00:52 DurzoAU nvm fixed by using IP instead of hostname
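For reference, the 3.5 geo-replication commands DurzoAU was most likely running, with the slave addressed by IP as in his fix; the volume names and address are illustrative:

    gluster volume geo-replication mastervol 10.0.0.5::slavevol status
    gluster volume geo-replication mastervol 10.0.0.5::slavevol stop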
01:03 fubada purpleidea: after factoring the vrrp dir create into its own class, do I need to still purge?
01:04 fubada looks like I fixed it, but I dropped the purge
01:05 fubada purpleidea: like so https://gist.github.com/aamerik/e922e8508711725d1a82
01:08 ubungu joined #gluster
01:16 ubungu joined #gluster
01:22 chirino joined #gluster
01:34 ubungu joined #gluster
01:47 ubungu joined #gluster
01:55 harish joined #gluster
01:55 plarsen joined #gluster
01:57 MattJ_EC joined #gluster
02:15 haomaiwang joined #gluster
02:19 quydo joined #gluster
02:25 neoice joined #gluster
02:34 T3 joined #gluster
02:37 nangthang joined #gluster
02:48 meghanam joined #gluster
02:52 hagarth joined #gluster
02:53 badone joined #gluster
03:03 M28 joined #gluster
03:09 cyberbootje joined #gluster
03:17 glusterbot News from newglusterbugs: [Bug 1147236] gluster 3.6 compatibility issue with gluster 3.3 <https://bugzilla.redhat.com/show_bug.cgi?id=1147236>
03:17 glusterbot News from newglusterbugs: [Bug 1179050] gluster vol clear-locks vol-name path kind all inode return IO error in a disperse volume <https://bugzilla.redhat.com/show_bug.cgi?id=1179050>
03:17 glusterbot News from resolvedglusterbugs: [Bug 1117822] Tracker bug for GlusterFS 3.6.0 <https://bugzilla.redhat.com/show_bug.cgi?id=1117822>
03:17 quydo joined #gluster
03:42 itisravi joined #gluster
03:44 bharata joined #gluster
03:46 ppai joined #gluster
03:49 calisto joined #gluster
03:50 quydo joined #gluster
03:57 kanagaraj joined #gluster
03:58 calisto joined #gluster
03:59 nbalacha joined #gluster
04:05 iPancreas joined #gluster
04:07 sputnik13 joined #gluster
04:07 hagarth joined #gluster
04:16 spandit joined #gluster
04:19 RameshN joined #gluster
04:28 vimal joined #gluster
04:33 shubhendu joined #gluster
04:38 jiffin joined #gluster
04:39 anoopcs joined #gluster
04:48 ndarshan joined #gluster
04:55 rafi1 joined #gluster
04:56 jiku joined #gluster
05:01 lalatenduM joined #gluster
05:05 nishanth joined #gluster
05:06 iPancreas joined #gluster
05:11 nrcpts joined #gluster
05:12 bala joined #gluster
05:12 T3 joined #gluster
05:16 atinmu joined #gluster
05:21 hagarth joined #gluster
05:27 dkelson joined #gluster
05:30 pp joined #gluster
05:32 kdhananjay joined #gluster
05:32 soumya_ joined #gluster
05:33 deepakcs joined #gluster
05:34 sahina joined #gluster
05:37 kumar joined #gluster
05:45 flu_ joined #gluster
05:46 dusmant joined #gluster
05:46 PaulCuzner joined #gluster
05:47 nshaikh joined #gluster
05:47 flu_ could anyone help? When I use gluster 3.5/3.6 clients, they fail to write anything to a 3.3 volume (there are some legacy volumes in the production environment which cannot be upgraded currently). You can read, create, and delete a file, but you can't write...
05:56 kshlm joined #gluster
05:56 flu_ from the client log, I could see: W [fuse-bridge.c:2242:fuse_writev_cbk] 0-glusterfs-fuse: 224: WRITE => -1 (Transport endpoint is not connected)  W [fuse-bridge.c:1236:fuse_err_cbk] 0-glusterfs-fuse: 225: FSYNC() ERR => -1 (Transport endpoint is not connected) W [fuse-bridge.c:1236:fuse_err_cbk] 0-glusterfs-fuse: 226: FLUSH() ERR => -1 (Transport endpoint is not connected)
05:58 dkelson what are the best practices regarding bricks on servers with many physical disks? RAID everything and have one brick per server?
06:00 overclk joined #gluster
06:07 iPancreas joined #gluster
06:12 rjoseph joined #gluster
06:15 dusmant joined #gluster
06:17 glusterbot News from newglusterbugs: [Bug 991084] No way to start a failed brick when replaced the location with empty folder <https://bugzilla.redhat.com/show_bug.cgi?id=991084>
06:19 hagarth joined #gluster
06:26 suman_d joined #gluster
06:28 hchiramm_ joined #gluster
06:39 saurabh joined #gluster
06:43 atalur joined #gluster
06:45 nangthang joined #gluster
06:45 raghu joined #gluster
06:49 anil joined #gluster
07:00 anrao joined #gluster
07:07 iPancreas joined #gluster
07:08 maveric_amitc_ joined #gluster
07:08 anrao joined #gluster
07:09 gem joined #gluster
07:10 Manikandan_ joined #gluster
07:11 kovshenin joined #gluster
07:11 jtux joined #gluster
07:12 ppai joined #gluster
07:12 T3 joined #gluster
07:17 glusterbot News from newglusterbugs: [Bug 1179076] [gluster-nagios] Monitor txerr/rxerr for network interfaces <https://bugzilla.redhat.com/show_bug.cgi?id=1179076>
07:23 anrao joined #gluster
07:24 deepakcs joined #gluster
07:31 JoeJulian bharata-rao: I built qemu with glusterfs support, but planned on not installing libgfapi right away but just having it ready in case I can convince people to switch back. I was surprised that it wouldn't start without the library installed. Couldn't you just check to see if a supporting dynamic library is installed at runtime, and if it's not, just disable that feature?
07:56 mbukatov joined #gluster
08:00 ppai joined #gluster
08:01 harish joined #gluster
08:04 isadmin1 joined #gluster
08:08 iPancreas joined #gluster
08:08 SOLDIERz joined #gluster
08:08 fandi joined #gluster
08:20 maveric_amitc_ joined #gluster
08:23 hagarth joined #gluster
08:31 Pupeno joined #gluster
08:32 anoopcs joined #gluster
08:40 ppai joined #gluster
08:41 Slashman joined #gluster
08:43 deniszh joined #gluster
08:48 anrao joined #gluster
08:57 ade_b joined #gluster
08:57 rjoseph joined #gluster
08:57 ade_b hi all, Im running oVirt on top of gluster and it works really nicely, but I do get one repeating log entry Id like to resolve ...
08:57 ade_b Unable to get lock for uuid: afd1c0f5-129b-4856-abdc-d6b15567dd51, lock held by: afd1c0f5-129b-4856-abdc-d6b15567dd51
08:58 ade_b can anyone show me how I go about resolving this ?
09:02 ndevos ade_b: I think that is bug 1154635
09:02 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1154635 high, unspecified, ---, amukherj, POST , glusterd: Gluster rebalance status returns failure
09:02 * ade_b reads
09:02 ndevos fixed with http://review.gluster.org/9012
09:02 ndevos but atinmu probably knows better than me
09:04 ndevos ade_b: well, that still can happen if you run multiple 'gluster volume status' or other commands at the same time - some commands take a gluster wide lock, and while that is kept, other commands will fail
09:04 toad joined #gluster
09:04 ade_b ndevos, ok yes it looks to be the same, so there should be a new package available soon ?
09:04 atinmu ade_b, is any of your command failing saying "Another transaction is in progress"
09:05 ade_b so I think that oVirt runs a command that leads to this
09:05 ade_b I know if I ask for a volume status or soemthing on one node it works, on the ovirt-management node it doesnt
09:06 ade_b " gluster volume status - Locking failed on node2. Please check log file for details."
09:06 nshaikh joined #gluster
09:07 ndevos atinmu: does http://review.gluster.org/9012 get backported to 3.6, 3.5 and maybe 3.4 too? or is that a mainline/3.7 change only?
09:07 atinmu ndevos, just a minute, will get back
09:07 ndevos atinmu: sure, not need to hurry :)
09:07 ade_b yikes, Im also seeing this in the logs http://fpaste.org/166150/53522214/   (but oVirt seems to be working ok)
09:08 iPancreas joined #gluster
09:09 ndevos ade_b: I dont think you need to 'yikes' that :) it's mostly messaged in Debug mode, so not really fatal
09:10 ndevos although I dont see why the 'E'rror message about 'returning 0' would point to an issue
09:10 gem joined #gluster
09:10 ade_b ndevos, phew, good to know. It "looked" scary :)
09:11 ade_b I do have a vdsm-gluster update available
09:12 ndevos ade_b: hehe, yes, looks scary, but really isnt
09:12 ade_b Once my current workload is finished I shall update those ovirt packages, thanks guys
09:18 ghenry joined #gluster
09:18 ghenry joined #gluster
09:19 anrao joined #gluster
09:20 dusmant joined #gluster
09:24 warci joined #gluster
09:26 warci hi all, is there some way to disable krb authentication on the nfs server side
09:27 warci apparently the gluster server in 3.5 announces it can authenticate krb5, but when a windows client tries to use that authentication it fails to connect
09:27 warci which is normal as we don't have krb5 configured
09:29 Dw_Sn joined #gluster
09:32 atinmu ndevos, I think the problem ade_b is facing is addressed at http://review.gluster.org/#/c/9269/
09:34 ndevos atinmu: sounds possible to me, and it seems that bug 1176756 is used for getting the patch in 3.6
09:34 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1176756 unspecified, unspecified, ---, amukherj, POST , glusterd: remote locking failure when multiple synctask transactions are run
09:36 ndevos warci: uh, no, gluster/nfs does not support krb... I dont know how it would announce that it supports it?
09:37 ndevos warci: you dont happen to have the kernel nfs server running? it would conflict with gluster/nfs and will surely result in weird errors
09:38 saurabh joined #gluster
09:39 atinmu ndevos, yes
09:39 warci ndevos: no, nfs is stopped. Well they did a network sniff and saw this... i'll check if i can get a hold of the trace
09:40 atinmu ndevos, however I am evaluating whether we should backport http://review.gluster.org/9012 to 3.6
09:41 ade_b thanks atinmu
09:41 ndevos atinmu: yes, I think a backport to 3.6 would be good, at least one user did its own backport (mentioned in the bugzilla)
09:43 SOLDIERz joined #gluster
09:43 ndevos warci: gluster/nfs only supports NFSv3, it is uncommon to use that with Kerberos - trying to use Kerberos should result in a failure and an other auth mechanism should be tried by the nfs-client
09:43 ndevos but well, *should*... I did not explicitly test that myself
09:48 glusterbot News from newglusterbugs: [Bug 1166515] [Tracker] RDMA support in glusterfs <https://bugzilla.redhat.com/show_bug.cgi?id=1166515>
09:48 nishanth joined #gluster
09:49 dusmant joined #gluster
09:50 warci ndevos: i'm compiling a google doc of the stuff we've seen
09:53 bala joined #gluster
09:54 soumya_ joined #gluster
09:57 nbalacha joined #gluster
09:59 atinmu ndevos, I've backported the fix to 3.6
09:59 atinmu ndevos, http://review.gluster.org/#/c/9393/
09:59 ndevos atinmu++ awesome, thank you!
10:00 glusterbot ndevos: atinmu's karma is now 3
10:02 warci ndevos: https://docs.google.com/document/d/1YVAiCdS1CQ0jnrI0DWPrLVr8uOILCiw1zaLJHgOBUEc/edit?usp=sharing
10:02 warci can you read this doc?
10:03 warci its a conversation with microsoft support
10:03 purpleidea fubada: yes, you need to purge, you need to use tabs, it's a typo for false you made anyways, but it should be true, and the class name should be whatever::base
10:03 ndevos warci: yeah, I can access that
10:04 warci i don't know if it's clear ...
10:06 ndevos warci: yes, its quite clear - you get "procedure cant decode params", maybe a different error code should get returned - I'll have to check the RFC for that
10:06 ndevos warci: but, it is very strange to switch from one auth scheme to an other...
10:07 deniszh1 joined #gluster
10:08 nbalacha joined #gluster
10:09 warci yes... i really don't get what's going on... but of course, it's the microsoft nfs client so...
10:09 iPancreas joined #gluster
10:11 rjoseph joined #gluster
10:16 ndevos warci: http://tools.ietf.org/html/rfc5531#section-9 contains the possible responses, you get MSG_ACCEPTED/GARBAGE_ARGS as a response on AUTH_KERB - maybe that MSG_DENIED/AUTH_ERROR/AUTH_FAILED would be more correct
10:16 warci ndevos: aha.... can i do anything about this? except modify the source :)
10:19 ndevos warci: well, I'm not sure if that would improve things for you... just tell your windows to stick to AUTH_UNIX/AUTH_SYS and dont try AUTH_KERB
10:19 ndevos :P
10:20 warci ndevos: yeah, we can mess with the clients, but the weird thing is that we didn't have this issue with gluster 3.4
10:21 ndevos warci: definitely weird, so the same clients can use gluster-nfs from 3.4 without issue?
10:21 warci yes... we've used them for almost a year without issues
10:22 warci then we upgraded gluster to 3.5 and now this pops up
10:22 ndevos hmm
10:22 ndevos I do not think the AUTH_* handling has changed, but maybe it did...
10:23 ghenry joined #gluster
10:23 warci don't have a 3.4 server running anymore so can't do a network trace right away...
10:23 Manikandan joined #gluster
10:31 flu_ joined #gluster
10:34 ndevos warci: I dont see any changes to the auth validation code that would explain the difference between 3.4 and 3.5
10:36 warci bizarre...
10:40 Dw_Sn joined #gluster
10:41 ndevos warci: maybe AUTH_KERB is tried because something in gluster-nfs behaves differently, but I can not think of anything related
10:43 warci ok... so, we have no choice but to modify all the clients... joy :)
10:43 warci thx anyway for the investigation
10:47 sahina joined #gluster
10:47 ndevos warci: I think we still should improve the handling of unsupported AUTH_* schemes in our Gluster RPC layer - MSG_DENIED/AUTH_ERROR/AUTH_FAILED looks more correct than a generic parsing error
10:48 warci yeah, probably the microsoft client's handling is not awesome either :) because i don't have any issues with my linux clients
10:48 nishanth joined #gluster
10:48 warci but of course microsoft blames the gluster server and our ticket with them was closed
10:49 ndevos warci: they blame an NFS server because it does not support AUTH_KERB?
10:50 ndevos warci: I would blame them because they switch from a working AUTH_UNIX connection to a non-working AUTH_KERB
10:51 warci no, because according to them the server advertises it supports the auth method but then it claims it doesn't
10:53 ndevos warci: uh, its news to me that an NFS-server can return a list of supported AUTH-schemes
10:53 warci :D
10:53 warci no comment....
10:53 ndevos warci: did you get any details on why the nfs-client thinks AUTH_KERB is supported?
10:55 warci my colleague followed the case, i'll ask him
10:55 warci colleague here :)
10:56 warci we didn't get any feedback yet on why the client thinks that AUTH_KERB is supported, but somehow we notice in the network traces that it starts as AUTH_UNIX, and then switches to AUTH_KERB
10:57 warci we suppose the server somehow presents the KERBEROS option as available, and since the client supports it, it starts using it
10:58 warci I'm not sure if we can find that out in the network traces?
10:59 ndevos no, the server does not send that...
11:00 ndevos there might be a trigger in the NFS-client that causes it to think that AUTH_KERB is better... but I would not know what that trigger would be
11:00 nbalacha joined #gluster
11:00 ndevos maybe some property that gets set in the GETATTR reply... or maybe something related to ACLs
11:01 ndevos if you still have the network traces, you could compare the GETATTR contents of the working and non-working clients
11:02 ndevos that is, if the GETATTR is indeed the last call/reply pair that was done with AUTH_UNIX
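If the traces need to be redone, a minimal server-side capture sketch (interface name and client IP are placeholders); the working and non-working captures can then be compared in Wireshark around the last AUTH_UNIX GETATTR call/reply that ndevos mentions:

    # NFSv3 traffic plus portmapper, for one client at a time
    tcpdump -i eth0 -s 0 -w client-working.pcap "host 192.0.2.10 and (port 2049 or port 111)"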
11:02 ghenry joined #gluster
11:02 T3 joined #gluster
11:05 dusmant joined #gluster
11:07 jvandewege Hi All, Are there plans to add a logrotate for cli.log? Asking because when gluster is used together with oVirt it grows to gigantic proportions.
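A logrotate sketch for the cli.log growth jvandewege describes; the rotation counts and the copytruncate choice are just examples, and newer packages may already ship something similar:

    cat > /etc/logrotate.d/glusterfs-cli <<'EOF'
    /var/log/glusterfs/cli.log {
        weekly
        rotate 4
        compress
        missingok
        notifempty
        copytruncate
    }
    EOF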
11:08 warci we have traces of a non-working client (when all security flavors are enabled on the client)
11:08 sahina joined #gluster
11:08 warci and we have also the traces of a working client
11:09 warci the info is in the first two screenshots on the google doc
11:10 iPancreas joined #gluster
11:11 harish joined #gluster
11:11 atalur joined #gluster
11:11 ndevos warci: yes, but the contents of the GETATTR might point to something - ah, no, it would not, its the same filehandle
11:15 ndevos warci: got a link to how to enable/disable security flavors on a MS Windows NFS-client?
11:16 ndevos warci: oh, and what version of windows is that?
11:19 rolfb joined #gluster
11:29 kovshenin joined #gluster
11:30 kkeithley1 joined #gluster
11:42 ndevos warci: you may want to watch bug 1179179 and give it some more input in case I missed anything
11:42 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1179179 medium, low, ---, bugs, NEW , When an unsupported AUTH_* scheme is used, the RPC-Reply should contain MSG_DENIED/AUTH_ERROR/AUTH_FAILED
11:42 calisto joined #gluster
11:44 nshaikh joined #gluster
11:45 meghanam joined #gluster
11:46 ndevos warci: it would be interesting to know if there are error messages in /var/log/glusterfs/nfs.log related to rpcsvc.c:rpcsvc_request_create or something like that
11:46 calisto1 joined #gluster
11:46 bene joined #gluster
11:47 Norky joined #gluster
11:49 glusterbot News from newglusterbugs: [Bug 1179179] When an unsupported AUTH_* scheme is used, the RPC-Reply should contain MSG_DENIED/AUTH_ERROR/AUTH_FAILED <https://bugzilla.redhat.com/show_bug.cgi?id=1179179>
11:50 ghenry joined #gluster
11:50 ghenry joined #gluster
11:53 ndevos REMINDER: Gluster Community Bug Triage meeting starts in 7 minutes in #gluster-meeting
11:53 atalur joined #gluster
11:56 spandit joined #gluster
12:04 itisravi_ joined #gluster
12:05 RameshN joined #gluster
12:06 LebedevRI joined #gluster
12:06 jdarcy joined #gluster
12:07 hagarth joined #gluster
12:10 iPancreas joined #gluster
12:11 neoice joined #gluster
12:12 SOLDIERz joined #gluster
12:18 glusterbot News from newglusterbugs: [Bug 1154491] split-brain reported on files whose change-logs are all zeros <https://bugzilla.redhat.com/show_bug.cgi?id=1154491>
12:18 glusterbot News from newglusterbugs: [Bug 1167012] self-heal-algorithm with option "full" doesn't heal sparse files correctly <https://bugzilla.redhat.com/show_bug.cgi?id=1167012>
12:18 glusterbot News from newglusterbugs: [Bug 1113907] AFR: Inconsistent GlusterNFS behavior v/s GlusterFUSE during metadata split brain on directories <https://bugzilla.redhat.com/show_bug.cgi?id=1113907>
12:19 harish joined #gluster
12:25 meghanam joined #gluster
12:27 nangthang joined #gluster
12:35 ira joined #gluster
12:42 calisto joined #gluster
12:44 hybrid512 joined #gluster
12:49 glusterbot News from newglusterbugs: [Bug 1179208] Since 3.6; ssl without auth.ssl-allow broken <https://bugzilla.redhat.com/show_bug.cgi?id=1179208>
12:52 aravindavk joined #gluster
12:54 ppai joined #gluster
12:56 warci ndevos: sorry, was off to emergency lunch :)
12:58 Fen1 joined #gluster
13:02 warci thanks for creating the bug report! i'll see about adding the other info you requested
13:05 hybrid512 joined #gluster
13:05 T3 joined #gluster
13:06 Slashman joined #gluster
13:11 iPancreas joined #gluster
13:11 fandi joined #gluster
13:15 B21956 joined #gluster
13:19 glusterbot News from newglusterbugs: [Bug 1177167] ctdb's ping_pong lock tester fails with input/output error on disperse volume mounted with glusterfs <https://bugzilla.redhat.com/show_bug.cgi?id=1177167>
13:27 _Bryan_ joined #gluster
13:47 lpabon joined #gluster
13:49 glusterbot News from newglusterbugs: [Bug 1177411] use of git submodules blocks automatic ubuntu package builds <https://bugzilla.redhat.com/show_bug.cgi?id=1177411>
13:50 tdasilva joined #gluster
13:55 ndevos warci: sure, no problem, and thanks for the notes + log!
13:55 aravindavk joined #gluster
13:55 warci ndevos: no prob, if you need anything else, give me a shout
13:58 ndevos warci: would you be able to test a patch or build with changes to the rpc/auth part of gluster?
13:58 warci sure!
13:59 * ndevos does not have a windows system, and has no idea where to get it, or the unix services that have the nfs bits
13:59 ndevos cool!
14:05 prasanth_ joined #gluster
14:08 RaSTar joined #gluster
14:10 virusuy joined #gluster
14:11 iPancreas joined #gluster
14:20 calisto joined #gluster
14:20 dusmant joined #gluster
14:22 eclectic joined #gluster
14:36 RameshN joined #gluster
14:40 calisto joined #gluster
14:43 Dw_Sn joined #gluster
14:43 bene joined #gluster
14:45 Hoggins joined #gluster
14:50 SOLDIERz joined #gluster
14:50 hchiramm_ joined #gluster
14:51 Hoggins Hello all, I have an odd behavior I'm not yet able to understand, and I believe I found something strange on my setup: two replicated nodes. When I ask gluster for the list of opened files, using "gluster volume top <volume> open" several times, the number of occurrences for some files increases rapidly, as if they were never released. Is that normal?
14:58 mdavidson joined #gluster
15:00 gem joined #gluster
15:03 RameshN joined #gluster
15:09 nangthang joined #gluster
15:12 iPancreas joined #gluster
15:13 coredump joined #gluster
15:19 coredump joined #gluster
15:20 plarsen joined #gluster
15:25 afics joined #gluster
15:29 prasanth_ joined #gluster
15:31 thangnn_ joined #gluster
15:32 thangnn_ joined #gluster
15:33 n-st joined #gluster
15:34 Telsin joined #gluster
15:35 lmickh joined #gluster
15:48 hagarth joined #gluster
15:49 bala joined #gluster
15:50 _Bryan_ joined #gluster
15:58 plarsen joined #gluster
16:04 nishanth joined #gluster
16:05 Fen1 joined #gluster
16:12 iPancreas joined #gluster
16:17 msp3k joined #gluster
16:18 bala joined #gluster
16:20 suman_d joined #gluster
16:25 nbalacha joined #gluster
16:26 msp3k Is there a way to stop gluster from using ports reserved for other services?
16:27 iPancreas joined #gluster
16:28 JoeJulian Hoggins: iirc, top doesn't reset the stats.
16:28 Hoggins JoeJulian: thank you, that's what I figured out, actually
16:30 Hoggins JoeJulian: I experience strange CPU peaks and clients unable to access the data unless I stop and start the volume, and I ran through stats to try to find out why, and that's when I found this top utility, and thought there was some kind of a leak, with fd never closing up. But I was wrong, obviously :)
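A small sketch tied to JoeJulian's note above that the top counters are cumulative; the volume name is a placeholder, and availability of the clear subcommand depends on the installed version:

    gluster volume top myvol open list-cnt 10   # cumulative open counts
    gluster volume top myvol clear              # reset the counters, then sample again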
16:31 JoeJulian msp3k: sysctl -w net.ipv4.ip_local_reserved_ports = ( your reserved port list )
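The same setting spelled out concretely; the port list is illustrative and should be whichever ports your other services need, and the sysctl.conf line makes it survive a reboot:

    sysctl -w net.ipv4.ip_local_reserved_ports="8080,9000-9010"
    echo 'net.ipv4.ip_local_reserved_ports = 8080,9000-9010' >> /etc/sysctl.conf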
16:32 JoeJulian Hoggins: What filesystem do you have on your bricks?
16:32 Hoggins er
16:32 msp3k JoeJulian: Thanks!
16:33 Hoggins JoeJulian,: XFS, as per the manual
16:33 JoeJulian Well that shoots down my one idea.
16:35 Hoggins I'm not sure, but everything was very stable before I upgraded the system of my VMs, the Gluster servers, and the clients
16:35 msp3k Is 3.6 the latest stable release?  Or is that a development release?
16:35 Hoggins they now all use 3.5.3, and I guess it was 3.4.2 before
16:36 JoeJulian 3.5 is considered the stable release.
16:36 msp3k Thanks
16:36 Hoggins I must admit I performed the upgrade of the bricks without being very cautious, so if there was a migration procedure, I'm sure I did not follow it :s
16:36 JoeJulian Heh, no, that should have been fine.
16:37 JoeJulian My only remaining guess is that you're hitting some sort of self-heal thing.
16:38 Hoggins yep, that's what I can see from time to time in the glusterfshd.log file, but I wonder why it would need any self-heal
16:38 JoeJulian Dirty xattrs somewhere is usually the reason for a heal to be needed but not show up in heal..info.
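Two checks that go with the dirty-xattr theory; the volume name and brick path are placeholders, and getfattr comes from the attr package:

    gluster volume heal myvol info                       # entries still pending heal, if any
    # on a brick, look at the AFR changelog xattrs of a suspect file
    getfattr -d -m trusted.afr -e hex /export/brick1/path/to/file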
16:41 lalatenduM joined #gluster
16:42 jobewan joined #gluster
16:44 zutto Could someone share some knowledge on when ls fails on glusterfs mount? I have 6 brick cluster (3 x 2 setup) and on 5 of the bricks on the glusterfs mount, ls fails on the root directory for some reason
16:45 zutto strace on ls when it fails just spams getdents on non stop loop
16:46 roost joined #gluster
16:47 roost_ joined #gluster
16:49 roost_ left #gluster
16:51 roost_ joined #gluster
17:04 JoeJulian That sounds like the old ,,(ext4) bug
17:04 glusterbot The ext4 bug has been fixed in 3.3.2 and 3.4.0. Read about the ext4 problem at http://goo.gl/Jytba or follow the bug report here http://goo.gl/CO1VZ
17:05 JoeJulian zutto: ^
17:05 zutto oh, thanks!
17:05 zutto all the bricks are running 3.4.0alpha, so that is probably it!
17:10 zutto wait wait wait, it cant be that same bug actually
17:11 zutto unless i got alpha that didnt contain the bug
17:12 zutto s/bug/bugfix/
17:12 glusterbot What zutto meant to say was: An error has occurred and has been logged. Please contact this bot's administrator for more information.
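Two quick checks for the ext4 readdir issue glusterbot links above; the brick path is a placeholder:

    glusterfs --version        # confirm whether the installed alpha predates the fix
    df -T /export/brick1       # brick filesystem type: the bug only affects ext4 bricks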
17:14 bennyturns joined #gluster
17:16 JoeJulian could be. Why are you running an old alpha release anyway?
17:16 zutto i just deployed the cluster and left it running its own life on the background
17:16 zutto now that i actually need to deal with the files on it i found this issue :P
17:17 JoeJulian so upgrade?
17:17 zutto i should
17:17 JoeJulian It's not like it's a production system.
17:17 zutto its almost production system
17:17 zutto just betatesting the software with few clients
17:17 NuxRo can I kill glustershd and trust that it will be automatically started if needed? I'm kind of short on memory and it's using a bit
17:17 * JoeJulian shakes his finger at zutto
17:18 JoeJulian NuxRo: not really, but you can kill it and restart glusterd to get it to start again.
17:18 NuxRo aha, thanks
17:19 JoeJulian zutto: seriously, please don't run that in prod. That's got some seriously bad bugs.
17:20 zutto by the looks of it, it wont get that far
17:22 NuxRo JoeJulian: /etc/init.d/glusterd will not start now, do you know which log file is relevant for it?
17:22 JoeJulian /var/log/glusterfs/etc-glusterd.vol.log
17:23 JoeJulian If it's useless, just try starting it from the command line in debug mode, "glusterd -d"
17:24 NuxRo yep, that was my next move, /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
17:25 JoeJulian No need to bother with the pid file if you're doing -d. It'll be foreground anyway.
17:26 NuxRo xlator.c:390:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
17:26 NuxRo graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
17:26 NuxRo sounds serious
17:26 NuxRo graph.c:479:glusterfs_graph_activate] 0-graph: init failed
17:30 lalatenduM joined #gluster
17:38 NuxRo E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-3
17:38 NuxRo D [store.c:566:gf_store_iter_get_next] 0-: Returning with -1
17:38 NuxRo I am seeing this kind of stuff in the --debug output
17:38 NuxRo rings any bells to anyone?
17:40 NuxRo all volumes have info files with brick-0 to brick-x defined, why would they appear as unknown key?
17:43 hagarth NuxRo: check if hostnames used in volume definition are resolvable on the node where glusterd is failing to start
17:45 JoeJulian unknown key always happens. It's a false error.
17:45 NuxRo hagarth: they are all pingable (resolvable)
17:46 NuxRo JoeJulian: so you suggest that is not the actual problem? (btw 3.4.0 here)
17:47 JoeJulian I do. Paste your "glusterd -d" results in fpaste.org and I'll take a peek.
17:47 hagarth NuxRo: check /var/lib/glusterd/vols/<volname>/info to see if hostnames for brick0 to brickN are appropriate & resolvable.
17:48 JoeJulian hagarth: Those are false errors. That happens every time I start glusterd on any server.
17:49 hagarth JoeJulian: yes but the error NuxRo seems to be experiencing is seen usually when there are issues in resolving a brick's hostname..
17:49 NuxRo JoeJulian: http://tmp.nux.ro/7kX-debug.log
17:49 hagarth anyway glusterd --debug output should be useful
17:50 NuxRo hagarth: yep, see above
17:50 NuxRo i will check info files meanwhile
17:50 JoeJulian or when a volume hash mismatch, or a full /var/log, or...
17:52 hagarth hmm, this seems to point to a failure in retrieving peers
17:52 hagarth [2015-01-06 17:47:46.206214] D [glusterd-store.c:2449:glusterd_store_retrieve_peers] 0-: Returning with -1
17:55 NuxRo hagarth: any way I can get more info about that? all other nodes are up and running, with glusterd running
17:55 NuxRo i also checked the info files and the hostnames are specified correctly
17:56 rjoseph joined #gluster
17:56 hagarth NuxRo: check /var/lib/glusterd/peers/*
17:56 hagarth and see if that looks fine
18:00 NuxRo hagarth: hmm, one of the files (94661455-585a-4e09-adc1-5a011f689769) is empty, the other 2 seem to have some valid data
18:02 NuxRo can I just try to copy that file from another peer?
18:03 hagarth NuxRo: yes, you should be able to
18:04 NuxRo hagarth: did that and glusterd started fine
18:04 NuxRo this is so odd
18:04 NuxRo thanks for the assistance guys, i would have had no clue to check that :)
18:05 hagarth NuxRo: glad that it is working :)
18:07 NuxRo hagarth: ah, one more thing, it would have been too nice to just work asap :)
18:07 NuxRo gluster volume status only lists the local bricks
18:08 hagarth NuxRo: check if gluster peer status lists other peers as connected
18:08 NuxRo gluster peer status
18:08 NuxRo peer status: No peers present
18:08 NuxRo weird, the other peers report it
18:09 hagarth NuxRo: hmm, check content of /var/lib/glusterd/peers/* again
18:09 NuxRo hagarth: holy smokes, it's empty!
18:09 NuxRo 0 files in /var/lib/glusterd/peers/
18:10 hagarth NuxRo: that looks odd, are you running out of space on /var by any chance?
18:11 NuxRo nope, plenty space remaining
18:11 NuxRo i can create new files in that dir
18:11 JoeJulian selinux?
18:11 NuxRo non-empty that is
18:11 NuxRo disabled
18:12 NuxRo should I try to recreate the files
18:12 NuxRo ?
18:12 NuxRo i can recover them from other nodes
18:12 hagarth I don't find any unlinking of files in /var/lib/glusterd/peers unless a detach happens
18:12 hagarth NuxRo: yes, but please avoid copying file corresponding to uuid of this node
18:13 NuxRo ok
18:13 hagarth NuxRo: you would need to copy distinct files from 2 nodes to ensure that all peers are populated back
18:17 NuxRo hagarth: yep, did that. restarting glusterd once more
18:18 NuxRo ok, recreated the files, but now volume status stil only shows the local brick
18:19 NuxRo hm, there is a difference in the peer files
18:19 hagarth NuxRo: check peer status again
18:19 NuxRo on this weird node the state=4
18:19 NuxRo in the peer files, on the "good" nodes state=3
18:19 NuxRo ok, so the peers are listed as: State: Accepted peer request (Connected)
18:20 NuxRo while on the other nodes it is "State: Peer in Cluster (Connected)"
18:21 ricky-ti1 joined #gluster
18:21 hagarth NuxRo:  does peer status on a different node list the problematic node as ""State: Peer in Cluster (Connected)"
18:21 NuxRo State: Peer in Cluster (Connected)
18:22 NuxRo on all 3 other nodes the problematic node is listed as that
18:23 hagarth does UUID in /var/lib/glusterd/info on the problematic node match the one listed in the peer status output on a different node
18:23 NuxRo you mean /var/lib/glusterd/glusterd.info
18:23 NuxRo checking
18:24 hagarth yes
18:25 NuxRo nope, that file reports UUID=c1474c29-1b43-44a9-8580-10f2cf73bc5e, but this uuid does not appear in the peers in the other nodes
18:25 NuxRo should I change the UUID to what the other nodes expect?
18:26 hagarth NuxRo: yes, and mv files with this <uuid> from /var/lib/glusterd/peers/ on other nodes.
18:26 NuxRo hagarth: this uuid is not present at all on the others
18:27 hagarth NuxRo: check in /var/lib/glusterd/peers/ now, I believe you will have a file with this uuid
18:29 NuxRo hagarth: i have changed UUID in the glusterd.info of the problematic nodes with what the other nodes expected
18:29 NuxRo and volume status now shows all bricks
18:29 NuxRo the old UUID=c1474c29-1b43-44a9-8580-10f2cf73bc5e is nowhere to be seen
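A condensed sketch of the recovery hagarth walked NuxRo through, for anyone hitting the same empty or mismatched peer files; the paths are the stock ones, adapt the service command to your init system:

    # on the broken node
    cat /var/lib/glusterd/glusterd.info          # UUID= this node believes it has
    ls -l /var/lib/glusterd/peers/               # one file per *other* peer, named by UUID
    # on a healthy node, 'gluster peer status' shows the UUID the cluster expects;
    # copy the missing peer files over from two healthy nodes, skipping the file that
    # matches this node's own UUID, fix UUID= in glusterd.info if it diverged, then:
    service glusterd restart
    gluster peer status && gluster volume status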
18:30 NuxRo i have no idea how this happened
18:30 meghanam joined #gluster
18:30 hagarth NuxRo: most likely a disk full or glusterd terminating while it was in the middle of writing state to disk or something else :)
18:31 NuxRo hehe
18:31 NuxRo well, it seems to be fine now
18:31 NuxRo thanks again, hopefully this was the last of it :)
18:31 hagarth NuxRo: hope so, I shall be off now. Have fun :)
18:31 NuxRo btw, where can I read about some best practices re log rotation?
18:32 NuxRo the logs do tend to get quite big
18:32 NuxRo hagarth: cheers, you too :)
18:37 Telsin anyone tried a rolling upgrade from 3.5.3 to 3.6.1? how'd it go?
18:39 JoeJulian Telsin: I've heard of people doing it and haven't heard any complaints.
18:39 bet_ joined #gluster
18:46 Telsin think I'll be trying it after lunch then, I'll report back
18:46 bennyturns joined #gluster
18:48 bennyturns joined #gluster
19:05 julim joined #gluster
19:11 virusuy joined #gluster
19:44 shaunm joined #gluster
20:07 PeterA joined #gluster
20:07 PeterA is that possible to allow subdir be mount over NFS from gluster?
20:18 rjoseph joined #gluster
20:26 DV joined #gluster
20:30 kkeithley_ Gluster's current NFS server does not allow mounting subdirs. You can use the NFS-Ganesha server backed by a gluster volume and export subdirs using that. You can experiment with NFS-Ganesha now, there are Ganesha packages in Fedora with Gluster support built in, and for RHEL/CentOS you can get Ganesha RPMs with Gluster support from download.gluster.org.
20:32 PeterA thx
20:40 M28_ joined #gluster
20:41 iPancreas joined #gluster
20:52 MattJ_EC joined #gluster
20:52 side_control joined #gluster
21:07 neofob joined #gluster
21:11 Dw_Sn joined #gluster
21:21 badone joined #gluster
21:27 cfeller joined #gluster
21:41 lpabon joined #gluster
21:42 bene joined #gluster
21:42 iPancreas joined #gluster
21:43 msp3k left #gluster
22:01 semiosis oh yeah, i published a new feature release of the ,,(java) library yesterday.  pretty much all the new stuff was done by my students/mentees!
22:01 glusterbot https://github.com/semiosis/glusterfs-java-filesystem
22:10 iPancreas joined #gluster
22:16 T3 joined #gluster
22:18 eka joined #gluster
22:19 eka joined #gluster
22:20 eka hi all... I was wondering if I can use gluster with one server node alone... I have a NFS server that suffers from performance issues since we are sharing ~3M files with 3 other servers
22:20 eka all images
22:33 eka joined #gluster
22:34 eka sorry was disconnected
22:34 eka hi all... I was wondering if I can use gluster with one server node alone... I have a NFS server that suffers from performance issues since we are sharing ~3M files with 3 other servers
22:36 devilspgd With just a single brick?
22:37 devilspgd I'd be a little surprised if gluster would really help in that configuration, at least for me (tons of smallish files), NFS outperforms gluster.
22:38 eka devilspgd: so what would be better?
22:38 eka devilspgd: isn't gluster better in performance than NFS?
22:38 sputnik13 joined #gluster
22:38 eka devilspgd: maybe cluster config?
22:38 semiosis nfs doesn't really outperform gluster, it just caches more aggressively
22:39 semiosis a gluster volume of one brick doesn't make sense
22:39 eka semiosis: so what would make sense for this qty of f iles?
22:39 semiosis if you're outgrowing the abilities of a single server storage solution, you can use gluster to replace your single nfs server with a cluster
22:39 semiosis sure
22:40 semiosis oh, what would, not would that
22:40 eka semiosis: question, so the files will be in one 'brick' and when adding another one to the cluster they will sync? or how that work?
22:40 eka semiosis: ?
22:40 devilspgd semiosis: True enough, but the results are visible to users, in that the "Your new server farm is really slow" complaints turned into "Thanks for making it fast again" when I flipped from gluster's client to NFS.
22:40 devilspgd Same gluster server backing everything
22:41 semiosis well ideally you'd start with a new, empty, gluster volume and copy your data in
22:41 eka I see
22:41 eka semiosis: and that will keep all the bricks in sync right?
22:41 devilspgd I had some other design issues which added overhead and probably amplified the problem, but at least for me in a web server environment, NFS has been far snappier.
22:42 semiosis lets say you had two servers, with two hard drives each. you could set up a 2x2 volume, replicating between the servers & distributing over the hard drives
22:42 eka devilspgd: but how? now is killing my app tho
22:42 eka semiosis: 2 hard drives each?  ...
22:42 semiosis just for example
22:43 semiosis how many hdd does your current server have?  do you use raid?
22:43 eka semiosis: why 2 each? 1 hard drive not enough?
22:43 R0ok_ joined #gluster
22:43 eka semiosis: they are all VMs
22:43 semiosis oh well then
22:43 devilspgd I should mention that I did not directly test against moving everything to local storage, I knew that wouldn't scale and so I only really compared gluster FUSE, NFS client, and I did minor testing with NFS without gluster involved at all.
22:43 semiosis it all depends what your bottleneck is
22:43 semiosis two hdd is faster than one
22:43 devilspgd I settled on "Gluster is awesome, but small file performance isn't great, NFS is faster than gluster fuse"
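The two client types being compared, as mount commands; server and volume names are placeholders, and the vers=3 option reflects ndevos's earlier note that Gluster's built-in NFS server only speaks NFSv3:

    # native FUSE client
    mount -t glusterfs server1:/myvol /mnt/myvol
    # or the built-in Gluster NFS server (NFSv3 only)
    mount -t nfs -o vers=3,tcp server1:/myvol /mnt/myvol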
22:44 eka semiosis: the current setup is a different volume holds all the files, it's mounted in a server that serves that via NFS and 3 different app servers mount that
22:44 semiosis sounds like what we had before we moved to gluster :)
22:45 eka semiosis: so I could create 3 VMs just for gluster ... and start copying all the files? that way?
22:45 eka but that will take ages
22:45 semiosis what kind of vm platform is this?
22:45 semiosis cloud?
22:45 eka semiosis: it's linode... SSD instances
22:46 semiosis idk much about linode, but i've been doing similar in AWS
22:46 semiosis two vms for gluster each with a bunch of virtual hdd attached
22:46 semiosis wasn't even ssd, but worked great
22:46 eka semiosis: just wondering... why I need more than 1 hdd?
22:46 devilspgd I've found Linode's SSD instances to be a mixed bag, not quite as good as I hoped, but better than DO for both IOPS and raw throughput.
22:47 semiosis better performance
22:47 eka semiosis: new to gluster though
22:47 semiosis fault tolerance
22:47 devilspgd eka: More spindles (okay, SSDs don't have spindles, but the same applies) typically means better latency and overall throughput...
22:47 eka semiosis: do you have any link to read about this kind of setup?
22:47 semiosis hmm
22:47 semiosis @canned ebs rant
22:47 glusterbot semiosis: http://community.gluster.org/q/for-amazon-ec2-should-i-use-ebs-should-i-stripe-ebs-volumes-together/
22:48 semiosis that probably doesnt work
22:48 semiosis @forget canned ebs rant
22:48 glusterbot semiosis: The operation succeeded.
22:48 devilspgd BUT, in the case of Linode, I don't think I'd bother creating a bunch of virtual drives, you'll be on the same physical storage anyway, so I think with Linode specifically you're probably better off using a single large drive.
22:48 eka devilspgd: any link for this setup?
22:48 eka so I can start reading?
22:49 devilspgd But there might be other considerations.
22:49 semiosis also need to consider growth.  you can easily expand underlying brick fs beneath gluster, but adding more bricks to a volume is risky.  people have had problems rebalancing
22:49 devilspgd eka: Not really, I mostly just discovered as I went. On Linode too, incidentally.
22:49 eka ok
22:49 semiosis best is to play around with gluster & try stuff out, see how it goes
22:49 daMaestro joined #gluster
22:49 eka devilspgd: so what should I read on gluster docs?
22:51 devilspgd Umm, all of them? Heh. I'm not sure, I just followed a getting started guide and a guide on the net that turned out to be wrong.
22:51 devilspgd I'm glad I did, you learn more being wrong than right :)
22:51 eka devilspgd: :D
22:52 devilspgd To be clear, the quick start guide or whatever from Gluster was good though.
22:52 eka ok
22:52 eka thanks guys!
22:52 devilspgd But some of the configuration decisions from some random blog didn't scale well to my environment, so I flattened and started over.
22:52 eka will play with 2 hosts
22:52 devilspgd Enjoy Linode, spin up 6 hosts, try building on 2, add 2 more, practice failing one, etc.
22:52 eka yes the thing is that I have so many files... testing this kind of setup would be difficult
22:53 eka and can't have down time so
22:53 devilspgd Sure, don't mess around like that in live... But before you go live, grab 1000 files and play with them.
22:53 JoeJulian rm -rf / # fixes the "too many files" problem
22:53 devilspgd You'll be out about $10 and a bit of time, but the things you'll learn.  ;)
22:54 eka JoeJulian: and /dev/null as file storage and that will be fast as hell... but write only ;)
22:54 JoeJulian :D
22:54 gildub joined #gluster
22:55 JoeJulian Unfortunately, you can't use /dev/null for a brick. It doesn't support extended attributes.
22:55 eka lol
22:55 T3 joined #gluster
22:56 devilspgd Write-only storage is surprisingly reliable, never unexpectedly lost a bit of data yet.
22:57 eka does gluster needs lotsa CPU?
22:57 JoeJulian sometimes
22:57 eka I see
22:57 JoeJulian During self-heals especially.
22:57 eka with NFS the load avg was on the roof
22:57 eka and that was killing my app
22:57 JoeJulian never used nfs, so I couldn't say.
22:58 eka so I will test some clusters
22:58 JoeJulian Possibly the cache pushing you into swap?
22:58 eka dunno really
23:00 eka semiosis: what were you using in AWS... wasn't EBS killing your performance?
23:01 purpleidea joined #gluster
23:01 semiosis eka: load avg on nfs server usually means your disk can't keep up, afaik
23:01 semiosis eka: so moving to gluster with several bricks probably a good idea :)
23:02 eka yes will try that
23:02 semiosis i use EBS (magnetic, there wasn't ssd until recently)
23:02 semiosis performance is great
23:02 eka bricks can be added on the fly?
23:02 semiosis if you add bricks to a volume you need to rebalance, that's the hard part
23:02 semiosis it's intensive and, unfortunately, error prone
23:02 eka ouch
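The expansion path being discussed, spelled out; names are placeholders, bricks must be added in multiples of the replica count, and the rebalance step is the long, I/O-heavy, occasionally troublesome part semiosis is warning about:

    gluster volume add-brick myvol server3:/export/brick1 server4:/export/brick1
    gluster volume rebalance myvol start
    gluster volume rebalance myvol status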
23:02 devilspgd eka: In theory, sure. In practice, if you're doing anything but a full replica, it seems to be less than super reliable.
23:02 devilspgd BUT... Again, you're on Linode.
23:03 semiosis so what i like to do is start out with a lot of small bricks that can be expanded over time.  ebs allows up to 1TB per volume, so I started with 256GB volumes and expanded them
23:03 devilspgd You have 2 boxes devoted to Gluster, you don't want to re-balance, fine.
23:03 devilspgd Spin up 3 Linodes, build a new gluster, rsync the data (repeatedly until it's up to date in real time) and flip the clients over.
23:04 devilspgd Do it over private IPv4 IPs and the bandwidth is free, it just takes time.
23:04 eka devilspgd: was thinking just that :D
23:04 devilspgd Kill the old two, and you'll just end up spending a few $ for the time when you overlap.
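A sketch of the repeated-rsync cutover devilspgd describes, assuming the old NFS export and the new gluster volume are both mounted on one staging host (paths are placeholders). Always copy through the gluster mount point, never straight into the bricks:

    # repeat until the delta is small, then a final pass during a short write freeze
    rsync -aHAX --delete /mnt/old-nfs/ /mnt/new-gluster/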
23:04 semiosis if you can afford it you really should use replication on gluster.  if you lose a server your clients can keep working
23:04 semiosis if you have many writer clients then you might want to consider replica 3 with quorum
23:04 devilspgd semiosis: Agreed... I was just gonna say that. Definitely use replication.
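A minimal sketch of the quorum option semiosis refers to, assuming a replica 3 volume named myvol (the name is a placeholder):

    # client-side quorum: writes need a majority of the replica set to be up
    gluster volume set myvol cluster.quorum-type auto
    gluster volume info myvol      # the option appears under Options Reconfigured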
23:05 eka I see... all clients write
23:05 devilspgd Linode has had host failures in the past.
23:05 devilspgd Oh... And don't use Linode's backups, or at least, don't rely on them.
23:05 semiosis also, replication helps with reads, since clients will read from all replicas
23:05 devilspgd They do not backup extended attributes.
23:05 semiosis ouch
23:05 devilspgd Nor does Linode's backups or filesystem tools support xfs, just ext3/4.
23:06 semiosis ebs was great for backups.  could snapshot a whole server with all attached disks.  restoring took a couple clicks and a couple minutes
23:06 purpleidea joined #gluster
23:07 devilspgd If you use ext4 a Linode backup is better than nothing, but you can't just restore and expect things to work because of the loss of extended attributes. Still, you won't lose user data, just sanity if you need to restore.
23:07 devilspgd Much better to build your own recovery plans in this particular case
23:08 semiosis have i mentioned recently just how great my experience running gluster on aws has been?  really great!
23:08 semiosis they dont even pay me to say that
23:09 devilspgd semiosis: Nice :)
23:10 eka semiosis: cool... but client is using linode :P
23:13 devilspgd I've generally had better results using Linode than Amazon for virtual servers, but I like Amazon for S3 storage, stuff like that.
23:14 devilspgd There's nothing wrong with Amazon's virtual servers, just from experience, I get more consistent performance from Linode.
23:21 klaas joined #gluster
23:21 semiosis joined #gluster
23:21 PeterA joined #gluster
23:21 ghenry joined #gluster
23:21 ghenry joined #gluster
23:21 mdavidson joined #gluster
23:21 gothos joined #gluster
23:21 semiosis joined #gluster
23:21 julim joined #gluster
23:21 georgeh_ joined #gluster
23:21 James joined #gluster
23:21 R0ok_ joined #gluster
23:22 radez joined #gluster
23:22 cfeller joined #gluster
23:22 plarsen joined #gluster
23:22 kovshenin joined #gluster
23:22 fubada joined #gluster
23:22 jvandewege joined #gluster
23:22 dblack joined #gluster
23:22 ckotil joined #gluster
23:22 mtanner joined #gluster
23:22 primusinterpares joined #gluster
23:22 codex joined #gluster
23:22 guntha_ joined #gluster
23:22 juhaj joined #gluster
23:22 jiffe joined #gluster
23:22 tru_tru joined #gluster
23:23 devilspgd joined #gluster
23:25 devilspgd joined #gluster
23:25 capri joined #gluster
23:26 blkperl joined #gluster
23:26 purpleidea joined #gluster
23:31 saltsa joined #gluster
23:31 klaas_ joined #gluster
23:31 purpleidea joined #gluster
23:31 purpleidea joined #gluster
23:34 isadmin1 joined #gluster
23:36 iPancreas joined #gluster
23:37 isadmin1 joined #gluster
23:40 kaii joined #gluster
23:41 jaank joined #gluster
23:52 corretico joined #gluster
23:55 Gugge joined #gluster
23:56 elico joined #gluster
23:57 lkoranda joined #gluster
