
IRC log for #gluster, 2014-10-25


All times shown according to UTC.

Time Nick Message
00:00 allgood just noted... not the same setup
00:00 calisto joined #gluster
00:00 allgood the one that worked is linux 3.2
00:00 allgood this one is 3.16
00:02 n-st joined #gluster
00:05 allgood upgrade from 3.2 to 3.5 ... should it be difficult?
00:05 JoeJulian No, but it will require downtime.
00:05 JoeJulian @upgrade notes
00:05 glusterbot JoeJulian: I do not know about 'upgrade notes', but I do know about these similar topics: '3.3 upgrade notes', '3.4 upgrade notes'
00:05 JoeJulian @3.3 upgrade notes
00:05 glusterbot JoeJulian: http://vbellur.wordpress.com/2012/05/31/upgrading-to-glusterfs-3-3/
00:05 allgood kernel is the same... just stopped the glusterfs-server and restarted
00:05 allgood got this
00:06 allgood # glusterfs
00:06 allgood [2014-10-25 00:04:34.462493] C [glusterfsd.c:1445:parse_cmdline] 0-glusterfs: ERROR: parsing the volfile failed (No such file or directory)
00:06 allgood no problem with this setup... it is empty right now
00:07 JoeJulian A directory moved: /etc/glusterd -> /var/lib/glusterd
00:07 allgood reading now
00:08 JoeJulian rpms handled that directory move for us, you guys have to move it by hand.
00:08 allgood great... worked
00:09 allgood the mount... lets see the mkfs
00:09 allgood thank you JoeJulian
00:10 JoeJulian You're welcome. :)
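For reference, the by-hand move JoeJulian describes (glusterd's working directory changed from /etc/glusterd to /var/lib/glusterd after 3.2) would look roughly like this on a Debian-style install; the service name and the copy-then-restart ordering are assumptions, not from the log:

    service glusterfs-server stop
    mkdir -p /var/lib/glusterd
    cp -a /etc/glusterd/. /var/lib/glusterd/
    service glusterfs-server start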
00:11 allgood is there any info on direct access to glusterfs from xen? without the context switches?
00:17 JoeJulian allgood: I haven't seen anything from the xen devs on that, no.
00:17 JoeJulian I'm really not sure why...
00:22 msmith joined #gluster
00:31 calisto joined #gluster
00:45 tryggvil joined #gluster
00:50 rjoseph joined #gluster
00:56 allgood joined #gluster
01:05 allgood left #gluster
01:23 MrAbaddon joined #gluster
01:33 sithik joined #gluster
01:34 sithik kkeithley - Are you still around? I was talking to you earlier about gluster and the time-out issue.
01:37 sithik In general I'll ask as well for anyone around. I have 2 bricks setup (I'd assume that means both are servers technically?) with the volume being created like so:  gluster volume create volume1 replica 2 transport tcp 10.132.1.1:/gluster-storage 10.132.1.2:/gluster-storage force   I understand in a server/client setup you'd obviously have a hang/time-out issue if the server went down, but since
01:37 sithik both are servers, just replicating each other isn't that the point of replicating? Neither need to be online for the other to work. By "work" I mean if one went down, why exactly does it hang if I try reading the directory?
01:39 T0aD joined #gluster
01:45 JoeJulian sithik: It only hangs if the client doesn't know that the server has gone down. If the server was shut down (like for maintenance) then the TCP connection is closed and a user using the client mount won't even know it happened.
01:45 JoeJulian If TACC trips over a network cord and fumbles around with it for 30 seconds, the client will patiently wait for that connection to come back and will continue on normally.
01:45 sithik So, even though I didn't install fuse (literally just glusterfs-server) on both machines, and they are replicating, one is still considered a client?
01:45 JoeJulian fuse is part of the kernel.
01:45 JoeJulian When you mount the volume, ie. mount -t glusterfs ... it's mounting via fuse.
01:45 sithik Hmm, I swear I had to specifically install fuse when I tried it in a server/client setup.
01:45 JoeJulian the client is a dependency of the server.
01:45 JoeJulian So if you installed a client, then you did have to install the glusterfs-fuse package.
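For reference, the fuse mount JoeJulian is describing would look something like this on one of sithik's nodes (the mount point is an assumption):

    mount -t glusterfs 10.132.1.1:/volume1 /storage-pool

mount -t glusterfs hands the work to mount.glusterfs, which starts the fuse client process for that mount.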
01:45 sithik So I'll ask for your suggestion. Ultimately I have 2+ nginx/php-fpm machines that obviously need to read the same set of files, have the same set of wp-uploads, etc. It kinda throws redundancy out the window, making load-balancing SOMEWHAT pointless as well, if anytime any of the machines go down the others will hang until the timeout is reached.... no?
01:45 JoeJulian @ping-timeout
01:45 glusterbot JoeJulian: The reason for the long (42 second) ping-timeout is because re-establishing fd's and locks can be a very expensive operation. Allowing a longer time to reestablish connections is logical, unless you have servers that frequently die.
01:46 T0aD joined #gluster
01:47 sithik So having two bricks, that simply replicate each other, one is still considered a "client"? Even when I setup a server and 2 clients (that literally just mounted the server's mount over the network connection and didn't store ANYTHING on the actual client machine) I SWEAR I still had the timeout issue, which is why I tried the replication approach thinking if both machines stored identical
01:47 sithik copies it wouldn't have that issue.
01:48 JoeJulian When you mount a volume, that mount is a client.
01:48 sithik Hmm, so essentially I'm running a server AND a client on each machine. Gotcha.
01:48 JoeJulian right
01:50 sithik So this makes zero difference. With the replication approach I did    localhost:/volume1   /storage-pool   glusterfs defaults,_netdev 0 0  in my /etc/fstab. When I did the one server/two client approach I mounted the clients to the server via  10.128.94.102:/volume1 /storage-pool glusterfs defaults,_netdev 0 0  in /etc/fstab. I guess I haphazardly assumed being the current approach
01:50 sithik (localhost:/volume1) would have alleviated that hang issue. :(
01:51 sithik So there's no real way to (even if I needed a third machine being the actual "master" that stored all the files) have any number of clients connected to it and if one went down, the others just happily churn along with no hiccup?
01:53 sithik To make the question extremely simple "how does google/facebook continue to chug along nicely if a machine goes offline"? I guess it is a little different when you're using hardware RAIDed machines where disk failures (equivalent to a network time-out) can keep running seamlessly. :P
02:38 ira joined #gluster
02:47 JoeJulian sithik: sorry, dinner called... Facebook - everything goes through load balancers and memcache. If a server goes offline, it's kicked out of the load balancer. They don't use hardware raid, in fact they use (spearheaded the development of) open compute hardware.
02:48 JoeJulian They install the application directly to the application servers using a configuration management tool they call Tupperware and they have enough hardware they can roll that out rack by rack.
02:49 haomaiwang joined #gluster
02:49 JoeJulian The content is stored on commodity drives in huge sequential files. Every image is appended to the end of some file on some disk (multiple servers of course) with index servers keeping track of which disk, offset, and length.
02:51 JoeJulian When a disk fails, that disk is kicked out of the database of available storage and a light lights up on the tray and disk. tacc replaces the disk and closes the ticket, at which point their automation tools format the disk and add it back into the database.
02:52 JoeJulian I'm less clear of the details for google storage.
02:52 JoeJulian Facebook, by the way, does use gluster for some things, but they haven't shared what.
03:02 sithik Sorry, was figuring out how I want to handle this. lol
03:04 sithik I mean, essentially without wanting to have that time-out hang or lowering the time-out to something absurd like 1 second, I guess the best approach would be just to have identical servers (Wordpress doesn't write to any file directly so that's not the issue) and then just have wp-uploads go directly to S3 since I'm using cloudfront anyway. That would keep everything churning I believe regardless
03:04 sithik of whether a server goes down or not.
03:04 sithik I got load-balancing and everything else covered, but gluster seemed like it could have made life super easy for multiple things instead of getting into salt, puppet, etc, etc, etc.
03:06 JoeJulian If you're hitting ping-timeout more than once every couple of years, you're doing it wrong.
03:10 sithik I mean you're totally right, but I have 0 users (shiiiit didn't even get to working on the wordpress site yet) but I like future-proofing/proactively seeing issues. Hence why I'm going haproxy -> varnish -> multiple nginx/php-fpm -> memcached -> mysql. lol
03:10 sithik I guess I just saw "zomg if master goes down there's going to be a 42 second hang of the site, which shiiiit varnish should be happily serving for non-cookied users but still". heh
03:10 JoeJulian Good! I like it when engineers realize the importance of putting the cache as close to the user as possible.
03:11 sithik TBCH, I'm unsure if that's supposed to be snarky. lol
03:11 JoeJulian Don't beat yourself up, though. Keep in mind a reasonable SLA and target that.
03:11 JoeJulian Not snarky at all. A lot of people screw that up.
03:11 ira joined #gluster
03:12 JoeJulian It's frustrating watching people try to squeeze performance out of rust when they could have served something from ram.
03:19 sithik JoeJulian - Just to throw one last piece of mud at gluster and the whole time-out once every couple of years comment. I'd have those time-out issues if ANY of the machines that had nginx/php-fpm on them crashed and I had to reboot ANY of them. That's my frustration with the time-out issue primarily and I'm fairly certain it's terrible to put the time-out to say... 5 or 10 seconds, right? :P
03:21 JoeJulian It's probably not terrible.
03:22 JoeJulian Just be aware that it's when the connection is re-established that's the issue and watch your load and logs. If they're timing out during reconnection, you may cause yourself a race condition. As long as you're aware, tune to your heart's content.
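The tuning being discussed is a single volume option; 10 seconds below is only an illustrative value, and the 42-second default can be restored with a reset:

    gluster volume set volume1 network.ping-timeout 10
    gluster volume reset volume1 network.ping-timeout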
03:22 sithik Hmmm true true.
03:23 JoeJulian Worst case, you screw it up and have to put it back... :D
03:23 JoeJulian Well... worst case it causes a split-brain...
03:23 JoeJulian then you have to figure out which copy of some file is worth keeping.
03:24 haomaiwang joined #gluster
03:25 sithik TBCFH I should probably just keep it running like I have it since it'd be a terrible coincidence to have varnish cache stale out within those 42 seconds time-out. dskfjdskljfkdsj
03:26 sithik If it was you, and you didn't want to worry about appservers hang for 42 seconds (i.e. your site not being accessible regardless of how many appservers you have) how would you do it? Would you just pray varnish didn't stale in those 42 seconds or would you go a different approach all together?
03:28 JoeJulian I guess it depends on what kind of service I'm offering and what kind of SLA I have to perform to.
03:28 JoeJulian In most cases, I've left ping-timeout alone.
03:30 sithik And just to be sure, this would be the correct setup for wanting to have multiple machines all have local versions of the files while keeping changes done on any machine in sync right?  gluster volume create volume1 replica 2 transport tcp 10.132.1.1:/gluster-storage 10.132.1.2:/gluster-storage force
03:32 JoeJulian That would do it.
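Put together, the setup sithik describes amounts to something like the following, run from 10.132.1.1; the volume name, brick paths, and mount point are taken from earlier in the log, the rest is a sketch:

    gluster peer probe 10.132.1.2
    gluster volume create volume1 replica 2 transport tcp 10.132.1.1:/gluster-storage 10.132.1.2:/gluster-storage force
    gluster volume start volume1
    # then on each node, mount the volume through its local glusterd:
    mount -t glusterfs localhost:/volume1 /storage-pool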
03:33 JoeJulian caveats: I don't like putting dynamic data on the root filesystem. If I could only have the one partition, I'd make a loopback file and mount it.
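A minimal sketch of that loopback approach, with an illustrative file path and size:

    truncate -s 20G /srv/gluster-brick.img
    mkfs.xfs /srv/gluster-brick.img
    mkdir -p /gluster-storage
    mount -o loop /srv/gluster-brick.img /gluster-storage

Many setups then use a subdirectory of that mount (e.g. /gluster-storage/brick) as the actual brick path rather than the mount point itself.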
03:34 sithik You know, I read about putting it on root filesystem and can you answer in one sentence why that's always considered a terribad idea?
03:34 JoeJulian When your root partition gets full and you cannot ssh in to fix it.
03:34 sithik I have VNC access.
03:34 JoeJulian Sometimes you can't even login
03:35 JoeJulian I always put logs and user data off root.
03:35 sithik ACK! Something else for thought now. Teheh, I kid.
03:35 JoeJulian In some cases, /var/lib also.
03:37 sithik Well... I sincerely thank you for your time. I know it's valuable and truly appreciate the chat.
03:38 JoeJulian Good luck.
03:41 JoeJulian ... I wish someone would tell my wife how valuable my time is...
03:45 sithik Oh JoeJulian, one last question I promise. Whenever I fire up additional boxes how do I "add" them as bricks? If   gluster volume create volume1 replica 2 transport tcp 10.132.203.135:/gluster-storage 10.132.203.136:/gluster-storage force   is used to form the initial volume, peer-probing doesn't "magically" add them to the replication does it?
03:56 sithik-alt joined #gluster
03:56 sithik-alt blah, don't know if you responded.
04:00 chirino joined #gluster
04:04 JoeJulian sithik-alt: No. To add more servers you will use the "gluster volume add-brick" command. Since you're going to have two replicas, you'll add bricks in pairs.
04:04 haomai___ joined #gluster
04:04 JoeJulian typically, then, you'll also want to rebalance.
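With replica 2 that expansion would look something like this; the two new server addresses are hypothetical:

    gluster volume add-brick volume1 10.132.1.3:/gluster-storage 10.132.1.4:/gluster-storage
    gluster volume rebalance volume1 start
    gluster volume rebalance volume1 status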
04:05 sithik-alt You know I'm reading this, and I don't think I want this setup technically. If each machine has 20 GB of storage, the brick technically has 20 GB X N bricks, and obviously you couldn't have 40 GB on a 20 GB disk.
04:05 sithik-alt Grr, so confused.
04:07 JoeJulian Right. Then you add your next pair of servers and you have 40 GB with 80 GB of disk...
04:08 sithik-alt Yea, that's what I was concerned with. Technically doing it this seemingly nice way means I have to scale by 2 every time.
04:09 JoeJulian @lucky expanding a glusterfs volume by one server
04:09 glusterbot JoeJulian: http://gluster.org/community/documentation/index.php/Gluster_3.1:_Expanding_Volumes
04:09 JoeJulian hmm, nope
04:10 JoeJulian http://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/
04:10 sithik-alt Was going to say, that's what I read about needing to add pairs.
04:10 glusterbot Title: How to expand GlusterFS replicated clusters by one server (at joejulian.name)
04:15 badone joined #gluster
04:17 sithik-alt Well again, thank you so much for your time. I gotta get to bed. Have a good night.
04:17 JoeJulian Goodnight
04:19 rjoseph joined #gluster
04:25 kanagaraj joined #gluster
04:42 bennyturns joined #gluster
04:48 soumya__ joined #gluster
04:56 Telsin joined #gluster
05:35 LebedevRI joined #gluster
05:43 brettnem joined #gluster
05:56 XpineX__ joined #gluster
06:21 Humble joined #gluster
06:55 ctria joined #gluster
07:40 rotbeard joined #gluster
08:26 glusterbot New news from newglusterbugs: [Bug 1157107] String to Floating point conversion failure with client packages during mount <https://bugzilla.redhat.com/show_bug.cgi?id=1157107>
08:38 al0 joined #gluster
08:44 plarsen joined #gluster
08:51 DV joined #gluster
08:57 TrDS joined #gluster
09:04 MrAbaddon joined #gluster
09:07 TrDS hi, i nearly killed a replicated volume by using replace-brick (using gluster 3.5.2), this seems to be a known bug... currently its running again with config files restored from a backup... but now, how do i move a brick from one disk to another? what about add-brick + remove-brick?
09:08 SOLDIERz joined #gluster
09:16 JoeJulian No, replace-brick is the tool for that. What do you mean by "nearly killed"?
09:27 Humble joined #gluster
09:28 ricky-ti1 joined #gluster
09:29 davidhadas_ joined #gluster
09:29 kumar joined #gluster
09:30 m0zes joined #gluster
09:31 shubhendu joined #gluster
09:46 TrDS "nearly killed" in this case means [step 1] replace-brick ... start [step 2] command hangs, timeout(?) after a minute, no message [step 3] every following command also hangs and times out, even vol status and vol info [step 3] kill all gluster daemons and restart gluster [step 4] basic commands work again, but replace-brick abort fails... but this seems to be known, there are bug tickets regarding replace-brick
09:46 DV joined #gluster
09:51 JoeJulian TrDS: Did it not come up with the deprecation warning suggesting to go straight to replace-brick ... commit force? It does that in 3.4.4...
10:01 _NiC joined #gluster
10:02 shubhendu joined #gluster
10:11 TrDS joined #gluster
10:11 TrDS JoeJulian: no, it did not... just hanging for a while, then went back to the prompt
10:11 TrDS (sorry, had to reconnect)
10:13 JoeJulian Hrm.. I wonder why that's not in 3.5.. weird. Well, that's the new way to do it. Not sure I like it, but there you have it.
10:13 JoeJulian Ok, it's 3:13am... I'm going to bed. Good luck to you.
10:13 TrDS thx
10:14 TrDS so commit force without start?
10:14 JoeJulian right
10:14 TrDS ok, thx
10:14 JoeJulian It swaps out the brick configuration and relies on self-heal to populate the new brick.
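In command form, that is a single step; the volume name and brick paths below are placeholders for TrDS's actual layout:

    gluster volume replace-brick myvol server1:/bricks/old-disk server1:/bricks/new-disk commit force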
10:31 TrDS does this mean that i just lost redundancy and self-heal will copy from the other brick instead of the old one?
10:47 TrDS btw. the new bricks disk usage is still near zero after 30 minutes, a .glusterfs dir was created, but self-heal doesn't seem to copy anything... do i have to stat all files manually to trigger self-heal?
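Rather than stat'ing every file by hand, a full self-heal crawl can be triggered and checked explicitly (volume name is a placeholder):

    gluster volume heal myvol full
    gluster volume heal myvol info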
10:49 davidhadas__ joined #gluster
10:56 davidhadas___ joined #gluster
11:06 badone joined #gluster
11:19 haomaiwang joined #gluster
11:21 SOLDIERz joined #gluster
11:22 haomai___ joined #gluster
11:23 davidhadas joined #gluster
11:27 davidhadas__ joined #gluster
11:29 TrDS let's recap... i have started a replace-brick (... commit force) operation... my questions: 1. by doing so, have i lost redundancy until replication is complete again? (i'd consider that to be very bad) 2. do i have to recursively stat every file to trigger self-heal, or will gluster do this itself?
11:29 TrDS i have to leave now, but i'm gonna read the log later, so thanks for all answers
11:30 TrDS left #gluster
11:38 Pupeno joined #gluster
12:04 diegows joined #gluster
12:16 m0zes joined #gluster
12:49 msmith joined #gluster
13:13 Intensity joined #gluster
13:22 harish joined #gluster
13:35 kshlm joined #gluster
13:55 soumya__ joined #gluster
14:09 msmith joined #gluster
14:12 deepakcs joined #gluster
14:26 T0aD joined #gluster
14:44 T0aD joined #gluster
14:56 T0aD joined #gluster
15:04 SOLDIERz joined #gluster
15:20 Pupeno joined #gluster
15:25 kumar joined #gluster
15:30 diegows joined #gluster
15:36 T0aD joined #gluster
15:37 kumar joined #gluster
15:39 msmith joined #gluster
16:26 elico joined #gluster
16:40 msmith joined #gluster
16:44 T0aD joined #gluster
16:52 n-st joined #gluster
16:53 n-st joined #gluster
16:54 kedmison joined #gluster
16:56 Lee- joined #gluster
17:27 glusterbot New news from newglusterbugs: [Bug 1115850] libgfapi-python fails on discard() and fallocate() due to undefined symbol <https://bugzilla.redhat.com/show_bug.cgi?id=1115850> || [Bug 1135016] getxattr and other filesystem ops fill the logs with useless errors (expected EPERM and the like) <https://bugzilla.redhat.com/show_bug.cgi?id=1135016>
17:29 glusterbot New news from resolvedglusterbugs: [Bug 847620] [FEAT] NFSv3 Authorization rpcsec_gss + krb5 (cluster aware credential cache) <https://bugzilla.redhat.com/show_bug.cgi?id=847620>
17:30 msmith joined #gluster
17:46 bennyturns joined #gluster
18:00 glusterbot New news from resolvedglusterbugs: [Bug 822361] Lookup of files with gfid's (created from backend) on nfs mount are not force merged <https://bugzilla.redhat.com/show_bug.cgi?id=822361>
18:44 SOLDIERz joined #gluster
19:44 steven2 joined #gluster
19:44 steven2 Hi, is there a package for CentOS 7 that will provide these scripts? https://github.com/gluster/glusterfs/blob/master/extras/hook-scripts/start/post/S30samba-start.sh
19:44 glusterbot Title: glusterfs/S30samba-start.sh at master · gluster/glusterfs · GitHub (at github.com)
19:45 steven2 I'm currently using this repo http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
19:46 JoeJulian seems unlikely if kkeithley did it right.
19:46 JoeJulian Oh, wait... I don't think he's doing the packaging now... anyway, 7 uses systemd, doesn't it?
19:47 steven2 correct
19:47 JoeJulian Then it shouldn't use an init script.
19:47 JoeJulian /usr/lib/systemd/system/glusterd.service
19:48 msmith joined #gluster
19:48 JoeJulian as for samba...
19:48 steven2 that makes sense, I'm studying for the EX236 and the redhat docs reference those scripts as part of the IP failover section
19:48 steven2 https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/ch09s04.html
19:48 glusterbot Title: 9.4. Configuring Automated IP Failover for NFS and SMB (at access.redhat.com)
19:49 steven2 I guess the heart of my concern is how to properly set this up if I won't have access to the internet during the exam
19:49 JoeJulian nevermind everything I've said. I haven't had coffee today.
19:50 JoeJulian I haven't seen a package with that. Seems like it should be its own package that would require samba.
19:50 steven2 ah ok, the issue I'm having is those directories exist but those files don't
19:51 JoeJulian Cool script. I hadn't looked at it before.
19:53 JoeJulian The hooks tree should exist as it's the method for applying user-defined changes to the volumes. That particular file would fail if samba wasn't installed so I would expect it to be in a glusterfs-samba package if such existed.
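If no package ships it, dropping the upstream script into the hooks tree by hand should work; the URL is derived from the GitHub link above and this exact procedure is untested here:

    wget -O /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh \
        https://raw.githubusercontent.com/gluster/glusterfs/master/extras/hook-scripts/start/post/S30samba-start.sh
    chmod +x /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh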
19:53 JoeJulian Would probably be valuable to file a bug to that end
19:53 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
19:54 theron joined #gluster
19:54 JoeJulian Also... since that's specific to RHS, they might package that script with their product. I'm not sure.
19:55 steven2 I was thinking the same
19:57 steven2 well, hoping that isn't it
19:58 theron joined #gluster
20:25 Pupeno_ joined #gluster
20:33 SOLDIERz joined #gluster
20:49 msmith joined #gluster
21:40 Pupeno joined #gluster
21:50 msmith joined #gluster
22:11 badone joined #gluster
22:22 SOLDIERz joined #gluster
22:30 badone joined #gluster
22:38 buhman http://sprunge.us/cNbK O.o "possible circular locking dependency detected" sound scary
22:43 SOLDIERz joined #gluster
22:45 msmith joined #gluster
22:56 _Bryan_ joined #gluster
23:02 SOLDIERz joined #gluster
23:07 JoeJulian Nifty. Is that gluster, or is that a kernel thing?
23:08 JoeJulian buhman: I assume you're debugging that so you can file a bug report? (assuming since you're running rawhide that that's what you do)
23:08 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
23:22 buhman JoeJulian: no idea how to reproduce :S
