IRC log for #gluster, 2013-11-04


All times shown according to UTC.

Time Nick Message
00:02 mattapperson joined #gluster
00:03 rubbs joined #gluster
00:27 rubbs joined #gluster
01:05 mattapperson joined #gluster
01:24 harish joined #gluster
01:40 diegows_ joined #gluster
02:24 lyang0 joined #gluster
03:43 Fresleven joined #gluster
03:49 Fresleven joined #gluster
03:52 glusterbot New news from newglusterbugs: [Bug 1026143] Gluster rebalance --xml doesn't work <http://goo.gl/hVyRoP>
04:01 Fresleven joined #gluster
04:06 saurabh joined #gluster
04:18 ppai joined #gluster
04:22 atul joined #gluster
04:26 mattapperson joined #gluster
04:30 shyam joined #gluster
05:20 bulde joined #gluster
05:27 Fresleven_ joined #gluster
05:30 mohankumar joined #gluster
05:41 mattapperson joined #gluster
05:41 rjoseph joined #gluster
05:44 mohankumar joined #gluster
06:00 mattapperson joined #gluster
06:20 hagarth joined #gluster
06:37 vimal joined #gluster
06:47 vshankar joined #gluster
07:03 mattapperson joined #gluster
07:18 jtux joined #gluster
07:23 mgebbe_ joined #gluster
07:23 mgebbe_ joined #gluster
07:25 DV joined #gluster
07:43 askb joined #gluster
07:54 ctria joined #gluster
07:59 franc joined #gluster
07:59 franc joined #gluster
08:03 ekuric joined #gluster
08:10 eseyman joined #gluster
08:13 keytab joined #gluster
08:23 samppah hagarth: ping
08:25 hagarth samppah: pong
08:27 samppah hagarth: https://bugzilla.redhat.com/show_bug.cgi?id=1022961 do you possibly have more information about the status of this bug?
08:27 glusterbot <http://goo.gl/lrwAbe> (at bugzilla.redhat.com)
08:27 glusterbot Bug 1022961: urgent, urgent, ---, ewarszaw, ASSIGNED , Running a VM from a gluster domain uses mount instead of gluster URI
08:28 samppah ovirt / rhev specific but you are mentioned in it :)
08:28 hybrid5121 joined #gluster
08:30 hagarth samppah: yeah, it is currently being treated as a blocker for ovirt 3.3.1
08:30 samppah no fix available for rhev either?
08:30 danci1973 Good morning... What is a typical IPoIB overhead? I have 4x SDR which should give about 8 Gbit/s, but using iperf I only see 2.92 / 3.48 Gbit/s ...
08:32 hagarth samppah: not yet, it is being treated as a blocker for rhev 3.3 release as well.
08:37 samppah hagarth: ok, thanks
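
    (Aside on danci1973's IPoIB numbers above: that kind of gap is often a symptom of the interface running in datagram mode with a small MTU. A quick check; the interface name ib0, the peer address, and the iperf invocation are illustrative assumptions, not taken from the log.)

        # check whether the IPoIB interface is in datagram or connected mode
        cat /sys/class/net/ib0/mode
        # connected mode allows a much larger MTU, which usually improves throughput
        echo connected > /sys/class/net/ib0/mode
        ip link set ib0 mtu 65520
        # re-test with parallel streams to rule out a single-stream bottleneck
        iperf -c 192.168.1.10 -P 4

    (Whether this closes the gap to the theoretical ~8 Gbit/s depends on the HCA and CPU; treat it as a starting point, not a definitive answer.)
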
08:47 eseyman joined #gluster
09:02 mattapperson joined #gluster
09:03 X3NQ joined #gluster
09:13 calum_ joined #gluster
09:45 morse joined #gluster
10:37 ccha hi I stop and start a volume and I got this message "Volume id mismatch for brick"
10:39 ricky-ticky joined #gluster
10:44 tziOm joined #gluster
10:47 ccha Expected volume id 08422967-bc7f-4216-8843-d83af44c0ed2, volume id d5ab576b-3a5c-44d2-8308-f4f18c980552 found
10:47 jfitz joined #gluster
10:47 ccha hum how can I find this id ? where this id come from ?
10:51 ccha oh this volume id was from geo-replication source volume id
10:52 ccha the geo-replication doesn't work and shows as faulty
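
    (Aside on the "Volume id mismatch for brick" error ccha hit: the id a brick carries can be read from the brick root and compared with what glusterd expects. A minimal check, assuming a brick at /bricks/brick1 and a volume named myvol, both placeholders.)

        # volume id stored on the brick itself
        getfattr -n trusted.glusterfs.volume-id -e hex /bricks/brick1
        # volume id glusterd expects, from the volume's info file
        grep volume-id /var/lib/glusterd/vols/myvol/info

    (If the brick was previously used by another volume, as in ccha's geo-replication case, the xattr on the brick root will still hold that other volume's id.)
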
10:52 NeatBasis joined #gluster
10:57 eryc joined #gluster
11:08 eryc joined #gluster
11:08 eryc joined #gluster
11:16 bulde joined #gluster
11:23 glusterbot New news from newglusterbugs: [Bug 990028] enable gfid to path conversion <http://goo.gl/1HwiQc> || [Bug 969461] RFE: Quota fixes <http://goo.gl/XFSM4> || [Bug 1026291] quota: directory limit cross, while creating data in subdirs <http://goo.gl/hesUtT>
11:34 rcheleguini joined #gluster
11:41 diegows_ joined #gluster
12:14 ppai joined #gluster
12:57 chirino joined #gluster
13:14 ira joined #gluster
13:14 ira joined #gluster
13:15 dewey joined #gluster
13:18 ctria joined #gluster
13:32 B21956 joined #gluster
13:35 haritsu joined #gluster
13:36 haritsu joined #gluster
13:53 Rav_ joined #gluster
13:56 haritsu joined #gluster
14:01 bennyturns joined #gluster
14:03 ctria joined #gluster
14:04 haritsu joined #gluster
14:05 eseyman joined #gluster
14:06 edward1 joined #gluster
14:12 andreask joined #gluster
14:24 squizzi joined #gluster
14:32 ctria joined #gluster
14:44 Technicool joined #gluster
14:45 Technicool oops wrong room to change nick :)
14:55 haritsu joined #gluster
14:55 haritsu joined #gluster
14:57 kaptk2 joined #gluster
15:02 bugs_ joined #gluster
15:16 purpleidea Technicool: cool
15:22 dbruhn joined #gluster
15:24 spechal_ How do you typically mount a glusterfs volume without a single point of failure?  i.e. if I mount 10.0.0.2, which is part of the cluster, and it dies, my servers will no longer mount.  Do you use DNS, HA Proxy, F5, anything?
15:25 wushudoin joined #gluster
15:26 dbruhn If you are using the Gluster Fuse client it is aware of all of the servers in the cluster as part of the initial mount operation.
15:34 elyograg spechal_: a simple way to deal with no redundancy at *mount* time is to put software on at least two of your gluster servers so that they have a virtual IP, and use the virtual IP in the mount command.  As dbruhn said, the fuse client already deals with failure after a volume is mounted, because the client downloads the volume info from the server listed in the mount and connects to the whole cluster.  I think you could also use round-robin DNS, bu
15:35 spechal_ Dealing with a failure before cluster connection is what I am looking at, so a virtual/floating IP with a heartbeat seems like a plausible solution
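
    (Aside: a concrete sketch of the mount-time options being discussed. Hostnames and paths are placeholders; the backup-server option is spelled backupvolfile-server in the 3.3/3.4-era mount script and backup-volfile-servers in later releases, so check the version in use.)

        # fetch the volfile from gluster1, fall back to gluster2 if it is down
        mount -t glusterfs -o backupvolfile-server=gluster2.example.com \
            gluster1.example.com:/myvol /mnt/gluster

        # or in /etc/fstab, using a round-robin DNS name that resolves to all servers
        gluster.example.com:/myvol  /mnt/gluster  glusterfs  defaults,_netdev  0 0

    (Either way the name is only needed to fetch the volume definition; after that the fuse client talks to every brick directly, as dbruhn notes above.)
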
15:37 ndk joined #gluster
15:38 haritsu joined #gluster
15:45 squizzi left #gluster
15:46 lpabon joined #gluster
15:49 dbruhn spechal_ rrdns seems to be the consensus on this one, you'll still have intermittent failure, but only if the rrdns hits the one down server
15:49 dbruhn spechal_ how many brick servers are you planning on
15:50 spechal_ probably only 3 until we feel comfortable with gluster
15:50 spechal_ eventually 7 or so
15:51 monotek joined #gluster
15:51 haritsu joined #gluster
15:56 haritsu joined #gluster
15:57 haritsu joined #gluster
16:22 dbruhn Well obviously the more you add the more useful the rrdns becomes in the situation
16:23 dbruhn spechal_ there are a lot of people with rather large gluster installs on here too, just an FYI
16:34 ndevos dbruhn, spechal_: a common setup uses rr-dns and as many virtual-ips as you have storage servers, if a storage server goes down, an other server holds the vip until the original server is back up
16:36 spechal_ ndevos: how would you allocate/delegate the virtual IPs?
16:43 ndevos spechal_: I've used pacemaker for that, and ctdb too - either works
16:43 Alex +1 on corosync/pacemaker
16:44 ndevos spechal_: if you export over samba, you need ctdb anyway (for locking in the samba clustering), otherwise I'd go the pacemaker route
16:45 spechal_ I guess what I am asking is, are you taking the IP that the servers use to mount to and using that as a virtual IP floating between the other machines?
16:48 ndevos no, I'd give each server a virtual/floating-ip, and have rr-dns rotate through those - use the dns virtual hostname for mounting
16:49 chirino joined #gluster
16:49 ndevos nfs clients will get balanced that way, and an outage of one server just moves the IP around, so that rr-dns still can be used as-is to resolve the hostname and mount a random server that is available
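
    (Aside: a sketch of the layout ndevos describes; the addresses, names, and resource ids are assumptions for illustration. One DNS name rotates over one floating IP per storage server, and a cluster manager moves a failed server's IP to a survivor.)

        ; round-robin A records for the mount/NFS name
        gluster  IN A 10.0.0.11
        gluster  IN A 10.0.0.12
        gluster  IN A 10.0.0.13

        # one floating IP per server as a pacemaker resource (crm shell syntax)
        crm configure primitive vip1 ocf:heartbeat:IPaddr2 \
            params ip=10.0.0.11 cidr_netmask=24 op monitor interval=30s

    (NFS clients stay pinned to whichever IP they mounted, which is why the IP has to float; fuse clients only need the name to resolve to some live server.)
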
17:02 zerick joined #gluster
17:04 noob21 joined #gluster
17:05 sjoeboo joined #gluster
17:05 noob21 has anyone else noticed the self heal daemon in 3.4.1 is less stable than 3.3.1?  it seems to crash much more often
17:08 aliguori joined #gluster
17:15 zerick joined #gluster
17:23 t35t0r joined #gluster
17:23 t35t0r joined #gluster
17:26 spechal_ I am trying to create a replica set to which I can add and remove nodes as needed.  I followed the quick setup instructions, but it appears I am now limited to adding 2 replica nodes at a time.  If I create the volume with replica 1, instead of replica 2, can I add and remove nodes one at a time?  If not, is there a way to resize the number of bricks for a volume so I can add single nodes?
17:27 spechal_ What I am looking to do is expand and contract the environment by one replica node as needed
17:30 Mo__ joined #gluster
17:35 rotbeard joined #gluster
17:43 noob21 left #gluster
17:43 JoeJulian volume add-brick <VOLNAME> [<stripe|replica> <COUNT>] <NEW-BRICK> ... - add brick to volume <VOLNAME>
17:43 JoeJulian spechal_: ^
17:45 johnbot11 joined #gluster
17:46 bosszaru joined #gluster
17:47 spechal_ The problem I am facing when running the command is: Incorrect number of bricks supplied 1 for type REPLICATE with count 2 ... I believe this is because I created the volume with replica 2
17:47 JoeJulian What command are you running?
17:48 spechal_ gluster volume add-brick gluster 10.2.181.233:/gluster
17:49 JoeJulian Is there anything else in the help text I posted that looks like it might be useful in changing the replica count?
17:49 spechal_ Yes, I am working on bringing back my VM to work with variations on your optional arguments
17:50 bosszaru had a brick volume fill up on a 3.2.5 server, added two new bricks (remote block storage), but when I try to kick off rebalance migrate-data or fix-layout it fails with log message http://pastebin.com/MiHw0VA4 and there's nothing in the troubleshooting guide for this
17:50 glusterbot Please use http://fpaste.org or http://paste.ubuntu.com/ . pb has too many ads. Say @paste in channel for info about paste utils.
17:50 JoeJulian Are you changing the replica count to handle read-load that's exceeding your two servers' capacity?
17:50 bosszaru oops on http://fpaste.org/51503/38358744/ instead
17:50 glusterbot Title: #51503 Fedora Project Pastebin (at fpaste.org)
17:51 spechal_ I am new to gluster and evaluating it.  I am looking to achieve a way to add and remove replicas as needed.  I had 2 nodes and was looking at adding a 3rd
17:51 JoeJulian bosszaru: What version is this?
17:51 bosszaru 3.2.5
17:52 JoeJulian gah, you said that already...
17:52 JoeJulian @replica | spechal_
17:52 JoeJulian ~replica | spechal_
17:52 spechal_ I thought you were talking to me and not bosszaru
17:52 glusterbot spechal_: Please see http://goo.gl/B8xEB for replication guidelines.
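
    (Aside: the help text JoeJulian pasted is the relevant piece. Since 3.3, add-brick and remove-brick accept a new replica count, which is how a 2-brick replica grows to 3 or shrinks back. A hedged sketch with the volume name, server, and brick path as placeholders; this is also the command spechal_ gets working further down, after upgrading.)

        # grow from replica 2 to replica 3 by adding one brick
        gluster volume add-brick myvol replica 3 server3:/export/brick1
        # shrink back to replica 2 by removing that brick again
        gluster volume remove-brick myvol replica 2 server3:/export/brick1 force

    (After growing, a "gluster volume heal myvol full" populates the new brick; see the heal discussion later in the log.)
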
17:54 JoeJulian bosszaru: That's one single server then?
17:54 bosszaru yes, I'm testing it out and fortunately it filled up (good scenario)
17:55 JoeJulian It's saying that the connection was refused to all bricks. Normally I would point to a firewall, but since the server resolves to localhost, that's not likely.
17:55 bosszaru I turned off iptables just to be sure
17:55 JoeJulian Are all four glusterfsd running?
17:56 dbruhn bosszaru, is the partition that contains /var/lib/glusterd one of the ones that filled up?
17:56 bosszaru yes for blok
17:56 bosszaru sorry yes, one for each block
17:57 bosszaru the partition is one of the block partitions that filled up
17:57 bosszaru /dev/xvdd1      750G  747G  3.2G 100% /rbs/disk_01
17:58 bosszaru thanks for the help
17:58 JoeJulian netstat -tlnp | grep glusterfsd
17:58 bosszaru tcp        0      0 0.0.0.0:24011           0.0.0.0:*               LISTEN      10841/glusterfsd
17:58 bosszaru tcp        0      0 0.0.0.0:24012           0.0.0.0:*               LISTEN      10845/glusterfsd
17:58 bosszaru tcp        0      0 0.0.0.0:24013           0.0.0.0:*               LISTEN      8020/glusterfsd
17:58 bosszaru tcp        0      0 0.0.0.0:24014           0.0.0.0:*               LISTEN      8024/glusterfsd
17:58 JoeJulian See which ports the bricks are listening on and ensure you can telnet to them.
17:59 bosszaru can't telnet to any of them
17:59 bosszaru oops
18:00 JoeJulian Since they're running and listening, something else is preventing that.
18:00 bosszaru that's not right
18:00 bosszaru oops need to do the port not the pid
18:00 JoeJulian heh
18:00 bosszaru that worked much better
18:01 bosszaru all are listening/responding to telnet
18:01 JoeJulian Wait!! You said you're testing?
18:02 JoeJulian You're not in production?
18:02 bosszaru yes, I'm just testing, but I'm happy I found something I did wrong
18:03 bosszaru good chance to learn
18:03 JoeJulian But if you're still in testing, you need to get on a current version, not that antiquated one. ,,(latest)
18:03 glusterbot The latest version is available at http://goo.gl/zO0Fa . There is a .repo file for yum or see @ppa for ubuntu.
18:03 davidbierce joined #gluster
18:04 bosszaru roger!  Will give that a try first
18:04 bosszaru thanks
18:05 JoeJulian You're welcome
18:17 spechal_ JoeJulian: I've used your suggested command as well as found the same command on google, but it fails with: http://fpaste.org/51514/89063138/
18:17 glusterbot Title: #51514 Fedora Project Pastebin (at fpaste.org)
18:21 elyograg does 3.2 support changing the replica count?
18:21 spechal_ JoeJulian: I resolved the issue by updating to the latest gluster
18:21 spechal_ Thanks for your patience
18:21 Keebs joined #gluster
18:24 Keebs Hello all. I'm a newbie to gluster and still reading/learning, but, in short, can gluster replicate to/recover from S3?
18:32 spechal_ Can someone point me in the right direction to resolve "volume sync: failed: sync from localhost not allowed"?  I am trying to sync data to a brick after adding it
18:42 saurabh joined #gluster
18:43 TDJACR joined #gluster
18:50 semiosis spechal_: what do you think sync does?
18:50 semiosis or rather, what are you trying to do?
18:51 semiosis Keebs: no
18:51 spechal_ I had a two brick replica and I added a third, I am trying to get the data onto it
18:51 semiosis spechal_: try volume heal full
18:51 spechal_ I would think sync would sync data from the target to the destination
18:51 semiosis volume sync is to sync volume configuration data between gluster servers
18:51 semiosis not actual brick data
18:51 spechal_ that would be my misunderstanding
18:51 spechal_ thank you
18:52 semiosis yw
18:53 brieweb_ joined #gluster
18:55 spechal_ While it is healing, when I issue volume heal <volume> info ... shouldn't I see the number of entries to be healed or do I misunderstand that too?  I currently see 0 entries on all three servers but looking at the disk free on server 3, it is healing
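
    (Aside: for reference, the commands involved in what spechal_ is doing, with myvol as a placeholder volume name. On 3.3/3.4 the heal info output lags behind reality, so an empty list while disk usage on the new brick is clearly growing is not unusual.)

        gluster volume heal myvol full          # queue a full self-heal to populate the new brick
        gluster volume heal myvol info          # entries currently needing heal, per brick
        gluster volume heal myvol info healed   # entries healed recently, per brick
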
19:00 brieweb_ Is there anyone who lives close to Davis, CA who would like to talk about gluster at one of our upcoming meetings? This is for the Linux User Group of Davis.
19:00 bosszaru upgrading from 3.2.5 to 3.4.1 causes volumes to go missing.  Is there a way to import a volume from an older version?  The brick data is still on the disk with the vol info files
19:06 spechal_ I just did a similar upgrade and I didn't lose my volumes or bricks ... I can't help because I am new to GlusterFS, but that's my experience.  I upgraded about half an hour ago
19:07 nasso joined #gluster
19:08 bosszaru all the volume files are on the disk etc, so I can't imagine what happened..
19:08 spechal_ I stopped the service on all boxes, did a yum update against the glusterfs repo, started the services back up and I haven't had an issue ... just added another replica and am healing now
19:09 spechal_ Did you update from a repo or build from source or what?
19:09 bosszaru from the ppa
19:09 spechal_ did you verify your config files weren't overwritten by apt?
19:10 SpeeR joined #gluster
19:11 bosszaru no files modified in /etc/glusterd
19:16 kPb_in joined #gluster
19:16 bosszaru oh-ho
19:16 bosszaru the upgrade reads from /var/lib/glusterd/vols/
19:18 bosszaru /etc/init.d/glusterfs-server stop && cp -R VOLUMENAME /var/lib/glusterd/vols/ && /etc/init.d/glusterfs-server start
19:18 bosszaru they're back :)
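
    (Aside: what bosszaru hit is the working-directory move between releases; 3.2 kept its state under /etc/glusterd, while newer packages read /var/lib/glusterd, so volume definitions that are not migrated simply do not show up. A sketch of the manual fix he describes, with VOLUMENAME as a placeholder and the Debian/Ubuntu init script name assumed.)

        /etc/init.d/glusterfs-server stop
        # copy the old 3.2 volume definition into the new working directory
        cp -a /etc/glusterd/vols/VOLUMENAME /var/lib/glusterd/vols/
        /etc/init.d/glusterfs-server start
        gluster volume info   # the volume should be listed again
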
19:19 TDJACR joined #gluster
19:23 Keebs semiosis, is there an s3 plugin/translator, and if so, what is its purpose then? I keep reading about it in forums, but not able to find any info on it.
19:27 chirino joined #gluster
19:36 semiosis Keebs: links please?
19:37 semiosis brieweb_: johnmark can probably help you with that
19:49 zaitcev joined #gluster
19:49 brieweb_ semiosis: thanks. I will see if I can ping him.
19:58 spechal_ good job bosszaru
20:02 bosszaru lsof is one of my truly best friends
20:19 badone joined #gluster
20:27 diegows_ joined #gluster
20:37 Fresleven_ joined #gluster
20:38 Fresleven_sysadm joined #gluster
20:48 pdrakeweb joined #gluster
20:50 johnbot11 joined #gluster
20:54 Keebs semiosis, here is one: https://forums.aws.amazon.com/thread.jspa?start=15&threadID=13786&tstart=0# look for dwmike's post on 02/21/2007
20:54 glusterbot <http://goo.gl/Mf2ptR> (at forums.aws.amazon.com)
20:55 glusterbot New news from newglusterbugs: [Bug 1023191] glusterfs consuming a large amount of system memory <http://goo.gl/OkQlS3>
20:58 semiosis Keebs: first i've ever heard of it.  i've been using gluster in ec2 (without S3) since 2010, version 3.1
20:59 semiosis afaik there's no S3 integration & no plan for it currently
21:05 Keebs semiosis, ty
21:07 P0w3r3d joined #gluster
21:20 Fresleven joined #gluster
21:30 badone_ joined #gluster
21:47 cjh973 joined #gluster
21:48 cjh973 gluster: my self heal daemon is logging a lot of readv errors.  I'm not sure what they mean
21:53 elyograg my volume rebalance, which looked on track to take forever, has failed after 1.5TB.
21:55 elyograg going to the rebalance log, this is what would fit on the screen with the putty window maximized.  http://fpaste.org/51579/83602049/
21:55 glusterbot Title: #51579 Fedora Project Pastebin (at fpaste.org)
21:56 elyograg that's the end of the log.
22:00 elyograg if I grep for just errors, here's that output from the end of the log. http://fpaste.org/51583/38360234/
22:00 glusterbot Title: #51583 Fedora Project Pastebin (at fpaste.org)
22:07 elyograg searching for 'gluster rebalance "setxattr dict is null"' shows only bug 859387 ... which is closed and has a fixed version of glusterfs-3.3.0.5rhs-40.  I'm on 3.3.1.
22:07 glusterbot Bug http://goo.gl/e6KOZ2 medium, high, ---, sgowda, CLOSED ERRATA, [RHEV-RHS] Rebalance migration  failures are seen when replicate bricks are brought down  and restarted
22:07 elyograg nothing was taken down or restarted.
22:08 elyograg that i know of, at least.
22:08 johnmwilliams joined #gluster
22:09 haritsu joined #gluster
22:16 elyograg all servers (4 with bricks and 2 without) show five peers that are all connected.
22:16 elyograg no server restarts, nobody's logged in and done anything manual.
22:18 elyograg gluster volume info: http://fpaste.org/51595/38360346/
22:18 glusterbot Title: #51595 Fedora Project Pastebin (at fpaste.org)
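
    (Aside: two things worth pulling out of a failed rebalance before digging through the full log, with myvol as a placeholder. The log path follows the usual <volume>-rebalance.log naming.)

        # per-node status: how many files were scanned/rebalanced/failed before the abort
        gluster volume rebalance myvol status
        # just the error-level lines near the end of the rebalance log
        grep " E " /var/log/glusterfs/myvol-rebalance.log | tail -n 100
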
22:18 fyxim joined #gluster
22:18 mattapperson joined #gluster
22:19 JoeJulian elyograg: On your first brick, is /newscom/mdfs/AKG/akgphotos/docs/224/049 a file or a directory?
22:19 elyograg the filename says it will be a file.  i'll verify.
22:20 elyograg yes, it's a file.  with 777 permissions.  I hope that's not common.
22:21 JoeJulian hmm, if I had time I'd read through the code and see what can cause "Failed to get node-uuid"
22:21 JoeJulian Unfortunately, I'm doing disaster recovery today...
22:22 JoeJulian On stuff I told him I needed additional equipment for, too. Grr.
22:22 elyograg one of the files in that dir (224) has the ---------T permissions on the fuse mount and isn't readable.  i really hope things aren't going to be bad here.
22:23 JoeJulian You're just replica 2 right?
22:23 elyograg yep.
22:23 JoeJulian I'm the one that usually catches the weird stuff with my replica 3...
22:24 SpeeR joined #gluster
22:24 Remco Possibly https://bugzilla.redhat.com/show_bug.cgi?id=928631
22:24 glusterbot <http://goo.gl/3Xruz> (at bugzilla.redhat.com)
22:24 glusterbot Bug 928631: urgent, high, ---, kaushal, CLOSED CURRENTRELEASE, Rebalance leaves file handler open
22:24 elyograg the volume info I pasted - the first 16 bricks were there, I added the other 16 on 10/28 after 9 PM and started the rebalance then.
22:25 haritsu joined #gluster
22:26 elyograg this file has every warning and error from my rebalance log.  the file inside the zip is 1.7GB, the whole logfile (not given here) is 2.6GB.  https://www.dropbox.com/s/f6iwhpmex4pxxpn/rebalance-errors-warnings-mdfs.zip
22:26 glusterbot <http://goo.gl/pGBHSn> (at www.dropbox.com)
22:30 Technicool joined #gluster
22:33 elyograg joined #gluster
22:35 elyograg i guess my network hardware at home rebooted.
22:35 JoeJulian Ewww...
22:36 elyograg I need new UPS batteries.  One UPS has new batteries, it runs my server.  the other runs the rest of the hardware and isn't quite so reliable. ;)
22:37 elyograg i saw nothing after putting up my dropbox link.
22:43 cjh973 is there an option to make the self heal daemon multithreaded?
22:43 cjh973 healing a 28T volume is taking way too long
22:43 mattapperson joined #gluster
22:47 cjh973 i should say, is there a way to increase the number of threads the shd has?
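
    (Aside: the 3.4 self-heal daemon does not appear to expose a simple thread-count knob, but a couple of volume options from that era influence how aggressively it works. Whether they help on a 28T volume is workload-dependent, so treat these as things to experiment with, not a fix; myvol and the values are placeholders.)

        # heals allowed to run in the background concurrently (default 16)
        gluster volume set myvol cluster.background-self-heal-count 32
        # maximum blocks per file self-healed at a time (default 1)
        gluster volume set myvol cluster.self-heal-window-size 8
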
22:49 mattapperson joined #gluster
22:52 mattapperson joined #gluster
22:52 elyograg any ideas on my problem would be appreciated.  finding and fixing the problems I can see in an automated way would be useful.  Two big ones I can see: permissions gone wrong, and files gone unreadable off the fuse mount with ---------T permissions but still present on one of the new bricks.
22:53 elyograg this is part of a directory listing via fuse: http://fpaste.org/51609/36049751/
22:53 glusterbot Title: #51609 Fedora Project Pastebin (at fpaste.org)
22:53 rwheeler joined #gluster
22:53 elyograg 777 is probably the original permission. 000 makes them unusable.
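
    (Aside: mode ---------T is how DHT link files show up: zero-byte placeholders with only the sticky bit set that point at the brick holding the real data. They normally stay hidden behind the fuse mount, but can show through when a rebalance/migration is interrupted, which matches what elyograg describes. A hedged way to inspect them, with the brick path and file path as placeholders.)

        # zero-byte, mode-1000 files on a brick are DHT link files
        find /bricks/brick1 -type f -perm 1000 -empty
        # the linkto xattr names the subvolume that holds the real file
        getfattr -n trusted.glusterfs.dht.linkto -e text /bricks/brick1/path/to/file
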
22:54 haritsu joined #gluster
22:57 elyograg bug 928631 that Remco mentioned says the fix should be in 3.3.2; is an upgrade in my future?  Should I avoid doing the upgrade twice and go straight to 3.4.1?  There's too much data for us to make backups.
22:57 glusterbot Bug http://goo.gl/3Xruz urgent, high, ---, kaushal, CLOSED CURRENTRELEASE, Rebalance leaves file handler open
22:58 mdjunaid joined #gluster
22:58 fyxim joined #gluster
22:59 bivak joined #gluster
22:59 DV joined #gluster
23:02 mattapperson joined #gluster
23:07 rwheeler joined #gluster
23:14 nasso joined #gluster
23:14 bivak joined #gluster
23:20 bosszaru1 joined #gluster
23:26 haritsu joined #gluster
23:50 Skaag joined #gluster
