
IRC log for #gluster, 2013-03-28


All times shown according to UTC.

Time Nick Message
00:26 Han joined #gluster
00:34 theron joined #gluster
00:59 yinyin_ joined #gluster
01:00 robo joined #gluster
01:07 jules_ joined #gluster
01:30 yinyin joined #gluster
01:31 glusterbot New news from newglusterbugs: [Bug 928575] Error Entry in the log when gluster volume heal on newly created volumes <http://goo.gl/KXsmD>
01:32 dumbda joined #gluster
01:32 nueces joined #gluster
01:32 dumbda when creating a volume, gluster tells me that the server i am creating the volume from is not a friend.
01:32 dumbda how can i resolve it?
01:37 robinr dumbda; have you done gluster peer probe ?
01:37 dumbda yes
01:38 dumbda it tells me not a friend for a host i am creating volume from
01:38 robinr do you have CNAMEs in the DNS names used ?
01:38 dumbda my hostname on ubuntu is serverfile1
01:38 dumbda in /etc/hosts
01:39 dumbda i have 10.10.2.3 serverfile1.domain.com
01:39 robinr hmm.. just for testing, you can do gluster peer detach NAME and then use IPs
01:39 robinr this will rule certain things out..
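
A minimal sketch of the detach-and-reprobe sequence robinr is suggesting, run from the first server; the peer name and the 10.10.2.4 address below are placeholders for whatever the second box is actually called:

    # drop the peer that was probed by its (possibly problematic) DNS name
    gluster peer detach serverfile2.domain.com
    # re-probe it by IP address to rule DNS out
    gluster peer probe 10.10.2.4
    # both sides should now show "Peer in Cluster (Connected)"
    gluster peer status
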
01:40 robinr funny, i ran into the exact same problems today. my dns names used are CNAMEs and i was getting those errors.
01:40 bala joined #gluster
01:41 robinr i don't know the reason behind it; but switching to IPs fixed my problems. that said, using IPs is generally not a good idea.
01:41 dumbda no not really a cname
01:41 dumbda for me
01:41 dumbda a record
01:41 dumbda yeah i am using aws
01:41 robinr yeah; did switching to IP addresses work for you ?
01:41 dumbda so IPs are changing
01:42 dumbda well so in my case i have 2 servers
01:42 dumbda with 2 elastic ips
01:42 dumbda so i really want to use dns names for gluster
01:42 robinr yeah
01:42 rastar joined #gluster
01:42 dumbda as i want clients to mount those shares using hostnames
01:42 robinr it should work fine. i've setup with DNS names before.
01:42 robinr yes.
01:43 dumbda i am getting that "not a friend" shit all over again
01:44 dumbda sigh
01:44 dumbda i am running out of time for the downtime
01:44 dumbda hell.
01:44 robinr check to see cat /var/lib/glusterd/peers/*
01:45 robinr you should see similar to: uuid=c474c51f-c612-44fa-93b6-e73f3031ad98
01:45 robinr state=3
01:45 robinr hostname1=10.1.194.18
01:45 dumbda one sec
01:46 dumbda yeah i have that
01:46 kevein joined #gluster
01:46 dumbda but for the second server i am replicating to
01:47 dumbda but nothing for the server i am creating volumes from
01:47 dumbda uuid=9eb3b4de-9231-4a81-94e7-72216bb52af7 state=3 hostname1=autosupport-file-2.domain.com
01:47 robinr then, the gluster peer probe most likely has failed
01:47 dumbda well it did not fail for the remote server
01:48 dumbda what do you have there
01:48 dumbda 2 server uids?
01:48 dumbda host 1 and the second one?
01:48 robinr here is my second server: uuid=c474c51f-c612-44fa-93b6-e73f3031ad98
01:48 robinr state=3
01:48 robinr hostname1=10.1.194.18
01:48 dumbda yeah
01:48 robinr and
01:48 robinr uuid=2dcd10e9-072c-46c3-a118-99572aef9754
01:48 robinr state=3
01:49 robinr hostname1=10.1.194.19
01:49 dumbda so but my error tells me that the first server is not a friend
01:49 robinr so, each server lists the other one  in hostname1
01:51 robinr if one of your servers did not have anything under peers directory, then we need to fix that. one thing to double check is /etc/hosts entries on both servers.
01:53 dumbda yeah
01:53 dumbda so i have that
01:53 dumbda i have in 2d server
01:54 dumbda state=3 hostname1=autosupport-file-1.domain.com
01:54 dumbda and in the 1st server
01:54 dumbda state=3 hostname1=autosupport-file-2.domain.com
01:54 robinr hmm..
01:54 dumbda and in /etc/hosts on the 1st server
01:55 dumbda public ip autosupport-file-1.domain.com
01:55 dumbda maybe needs to be private ip on the server 1?
01:55 robinr do you have entry for autosupport-file-2.domain.com ?
01:57 robinr at autosupport-file-1.domain.com, make sure both autosupport-file-2.domain.com and autosupport-file-1.domain.com exist in /etc/hosts
01:57 robinr both entries need to exist at autosupport-file-2.domain.com's /etc/hosts file as well
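
As a sketch of what robinr is asking for, /etc/hosts on both servers might carry entries like these (the addresses are placeholders; the point is that each name resolves the same way, to addresses the peers can actually reach, on both machines):

    # /etc/hosts on autosupport-file-1.domain.com AND autosupport-file-2.domain.com
    10.10.2.3   autosupport-file-1.domain.com   autosupport-file-1
    10.10.2.4   autosupport-file-2.domain.com   autosupport-file-2
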
02:01 robinr assuming they can communicate freely using either public or private addresses, you just need to be consistent and choose one or the other.
02:01 robinr good luck
02:01 dumbda oh it is not
02:02 dumbda but autosupport-file-2.domain.com
02:02 dumbda is resolvable
02:02 dumbda by dns
02:02 dumbda But dns is working
02:02 dumbda maybe some bug with lookup.
02:04 hagarth joined #gluster
02:12 bronaugh left #gluster
02:22 dumbda failed to create volume
02:26 dumbda sigh
02:30 dumbda i need some help
02:30 dumbda cannot create replica volume
02:47 vshankar joined #gluster
02:52 hagarth joined #gluster
02:59 dumbda still need some help please
03:22 rastar joined #gluster
03:26 ramkrsna joined #gluster
03:26 ramkrsna joined #gluster
03:41 timothy joined #gluster
03:47 yinyin joined #gluster
03:50 ramkrsna joined #gluster
03:56 ramkrsna joined #gluster
03:57 JoeJulian dumbda: Check localhost in /etc/hosts
03:58 dumbda Hi Joe.
03:58 dumbda I used your blog to fix it.
03:58 JoeJulian There's some vague recollection that something to do with missing localhost entries causes that.
03:58 JoeJulian Oh, cool.
03:58 dumbda I had to add 127.0.0.1 myhostname
03:59 JoeJulian Ok, that's just funny... vague recollection and it's something I blogged about... I must be getting old.
03:59 dumbda the error in the logs on the remote server was that it could not resolve its own hostname
03:59 dumbda even though i could ping it.
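
The fix dumbda describes boils down to letting the server resolve its own hostname locally; a hedged example for the first server, using the names from this conversation:

    # /etc/hosts on autosupport-file-1.domain.com
    127.0.0.1   localhost
    127.0.0.1   autosupport-file-1.domain.com   autosupport-file-1
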
03:59 dumbda i have another question
04:00 dumbda since i added replica to an existing volume with data around 100G
04:00 dumbda and triggered selfheal on the client
04:00 dumbda can start apache there on the client
04:00 dumbda i am accessing that mount on the web server
04:01 dumbda it seems like access time dropped dramatically; the thing is that server 1 is in the aws ec2 east region and the replica server is in the west region
04:02 dumbda so the sync goes quite slowly
04:02 dumbda and site performance apparently is a drag.
04:02 dumbda but i can't afford downtime till self-heal is finished.
04:06 JoeJulian Yeah, replication across availability zones multiplies the latency.
04:08 pai joined #gluster
04:09 dumbda and glusterfs self-heal uses rsync?
04:09 sgowda joined #gluster
04:09 dumbda on the backend to replicate data?
04:10 dumbda as i see rsync process running.
04:16 JoeJulian Oh, you're using georeplicate. Yes, that uses rsync and that is the recommended tool for that.
04:19 dumbda well all i did was "gluster create volume data replica2 server1:/data server2:/data
04:19 dumbda i do not know if it is georeplication.
04:20 JoeJulian No, that's not, and that does not use rsync.
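
For reference, the replicated volume dumbda describes would normally be created with something like the following (the command as quoted above has the words in a slightly different order; this sketch assumes the 3.3-era CLI and the server names from the log):

    # synchronous AFR replication: the client writes to both bricks itself, no rsync involved
    gluster volume create data replica 2 transport tcp server1:/data server2:/data
    gluster volume start data
    # geo-replication is a separate, asynchronous feature and is what uses rsync
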
04:22 sgowda joined #gluster
04:25 yinyin joined #gluster
04:30 bala joined #gluster
04:33 vpshastry joined #gluster
04:33 shylesh joined #gluster
04:39 lalatenduM joined #gluster
04:45 saurabh joined #gluster
04:50 johnf joined #gluster
04:51 johnf Hi Does anyone know of any public info anywhere on who is using gluster? Giving it as a recommendation for a project and customer wants some comfort knowing some big names are using it
05:00 sripathi joined #gluster
05:01 ehg joined #gluster
05:05 yinyin joined #gluster
05:05 aravindavk joined #gluster
05:12 johnf left #gluster
05:15 31NAAA7BB joined #gluster
05:17 hagarth joined #gluster
05:19 rastar joined #gluster
05:21 vpshastry joined #gluster
05:28 deepakcs joined #gluster
05:33 yinyin joined #gluster
05:33 hagarth left #gluster
05:48 vpshastry joined #gluster
05:53 satheesh joined #gluster
06:02 glusterbot New news from newglusterbugs: [Bug 928631] Rebalance leaves file handler open <http://goo.gl/3Xruz>
06:03 raghu joined #gluster
06:05 jules_ joined #gluster
06:08 rotbeard joined #gluster
06:09 raghug joined #gluster
06:32 glusterbot New news from newglusterbugs: [Bug 916372] NFS3 stable writes are very slow <http://goo.gl/Z0gaJ>
06:37 ngoswami joined #gluster
06:38 ricky-ticky joined #gluster
06:58 guigui3 joined #gluster
07:00 vimal joined #gluster
07:07 vshankar joined #gluster
07:11 andreask joined #gluster
07:22 vshankar joined #gluster
07:24 mohankumar joined #gluster
07:26 ekuric joined #gluster
07:32 glusterbot New news from newglusterbugs: [Bug 928656] nfs process crashed after rebalance during unlock of files. <http://goo.gl/fnZuR>
07:37 timothy joined #gluster
08:00 timothy joined #gluster
08:05 tjikkun_work joined #gluster
08:10 glusterbot New news from resolvedglusterbugs: [Bug 764755] volume stop failed ,when one of the brick is not responding <http://goo.gl/69xwF>
08:39 ujjain joined #gluster
08:44 sripathi joined #gluster
08:44 piotrektt_ joined #gluster
08:46 xiu it seems that i have this problem: http://permalink.gmane.org/gmane.comp.file-systems.gluster.user/3594 but i can't find any solution (i'm running 3.2.6 on a distributed/replicated volume)
08:46 glusterbot <http://goo.gl/k2rE5> (at permalink.gmane.org)
08:49 hybrid512 joined #gluster
08:55 hybrid512 joined #gluster
09:02 glusterbot New news from newglusterbugs: [Bug 928685] fails to stop volume when can't access to a brick <http://goo.gl/jVt84>
09:14 vpshastry joined #gluster
09:25 Chiku|dc my volume is replicated on 2 gluster servers. When my client mounted with glusterfs reads data, does it read from 1 gluster server? if 2 applications on my client read data, do they read from only 1 gluster server?
09:26 Chiku|dc to get better read performance...
09:27 Chiku|dc same thing if I mount with nfs ?
09:27 Staples84 joined #gluster
09:36 sripathi joined #gluster
09:49 dobber_ joined #gluster
09:55 sripathi joined #gluster
10:08 manik joined #gluster
10:13 raghug joined #gluster
10:14 alex88 hi guys
10:14 alex88 I've 2 clients, one works fine
10:14 alex88 the other, connecting, says XDR decoding error, failed to fetch volume file
10:14 alex88 but the cmndline to mount is the same
10:28 alex88 oh damn, forgot to apt-get update after installing ppa
10:33 raghug joined #gluster
10:37 joehoyle- joined #gluster
10:38 hagarth joined #gluster
10:57 shireesh joined #gluster
10:57 badone joined #gluster
10:59 puebele joined #gluster
11:13 NeatBasis joined #gluster
11:23 inodb_ joined #gluster
11:23 edong23_ joined #gluster
11:23 ninkotech joined #gluster
11:23 johnmorr joined #gluster
11:23 chlunde joined #gluster
11:23 vex joined #gluster
11:23 avati joined #gluster
11:23 m0zes joined #gluster
11:23 vex joined #gluster
11:23 kincl joined #gluster
11:23 kincl joined #gluster
11:23 NuxRo joined #gluster
11:23 dmojoryder joined #gluster
11:23 johndescs1 joined #gluster
11:23 frakt joined #gluster
11:23 logstashbot` joined #gluster
11:24 nonsenso joined #gluster
11:24 kkeithley1 joined #gluster
11:24 redsolar joined #gluster
11:24 Kins_ joined #gluster
11:24 Zengineer joined #gluster
11:24 MinhP joined #gluster
11:24 kkeithley1 left #gluster
11:24 kkeithley1 joined #gluster
11:26 penglish joined #gluster
11:27 RobertLaptop_ joined #gluster
11:31 atrius joined #gluster
11:31 johnmark joined #gluster
11:31 gluslog joined #gluster
11:31 copec joined #gluster
11:31 flrichar joined #gluster
11:31 dblack joined #gluster
11:31 roo9 joined #gluster
11:31 efries joined #gluster
11:31 yosafbridge joined #gluster
11:31 snarkyboojum joined #gluster
11:31 x4rlos joined #gluster
11:31 JordanHackworth joined #gluster
11:31 semiosis joined #gluster
11:31 flin joined #gluster
11:31 puebele1 joined #gluster
11:31 puebele1 joined #gluster
11:41 glusterbot New news from resolvedglusterbugs: [Bug 928685] fails to stop volume when can't access to a brick <http://goo.gl/jVt84> || [Bug 907202] Gluster NFS server rejects client connection if hostname is specified in rpc-auth <http://goo.gl/cxxJg>
11:41 logstashbot Title: Bug 928685 fails to stop volume when can't access to a brick (at goo.gl)
11:41 glusterbot Bug http://goo.gl/jVt84 unspecified, unspecified, ---, kparthas, CLOSED NOTABUG, fails to stop volume when can't access to a brick
11:50 puebele1 joined #gluster
11:53 joehoyle joined #gluster
12:03 vincent_vdk joined #gluster
12:07 bennyturns joined #gluster
12:13 yinyin joined #gluster
12:17 manik joined #gluster
12:25 shireesh joined #gluster
12:27 hagarth joined #gluster
12:28 flrichar
12:29 rosmo joined #gluster
12:29 balunasj joined #gluster
12:31 rosmo hi guys, i'm trying to add a brand new peer to 3.3.1 but all i'm getting is State: Peer Rejected (Connected)
12:31 rosmo also a "Cksums of volume virt-pool-1 differ." in glusterd log
12:33 robo joined #gluster
12:46 aliguori joined #gluster
12:51 ProT-0-TypE joined #gluster
13:03 yinyin joined #gluster
13:03 glusterbot New news from newglusterbugs: [Bug 928781] hungs when mount a volume at own brick <http://goo.gl/ieOkk> || [Bug 918917] 3.4 Beta1 Tracker <http://goo.gl/xL9yF>
13:03 logstashbot Title: Bug 928781 hungs when mount a volume at own brick (at goo.gl)
13:03 glusterbot Bug http://goo.gl/ieOkk unspecified, unspecified, ---, kparthas, NEW , hungs when mount a volume at own brick
13:19 rwheeler joined #gluster
13:21 vpshastry left #gluster
13:23 semiosis logstashbot: leave
13:23 logstashbot semiosis: Error: "leave" is not a valid command.
13:23 semiosis logstashbot: part
13:23 logstashbot left #gluster
13:23 semiosis Oops.
13:24 hagarth joined #gluster
13:29 bennyturns joined #gluster
13:29 lalatenduM joined #gluster
13:30 robos joined #gluster
13:34 wN joined #gluster
13:36 johnmark semiosis: IRC hacking today? :)
13:37 lalatenduM joined #gluster
13:38 semiosis Every day
13:38 johnmark awesome
13:38 semiosis Brb drqz on stage now
13:38 johnmark drqz?
13:39 semiosis Neil gunther twitter @drqz
13:40 theron joined #gluster
13:40 semiosis Im at monitorama right now
13:41 lpabon joined #gluster
13:41 johnmark ah, ok
13:41 johnmark yeah, I figured. just didn't know who drqz was
13:42 * johnmark wants to have a good monitoring story for glusterfs
13:42 johnmark interesting blog post: http://www.eaglegenomics.com/2013/03/glusterfs-vs-a-future-distributed-bioinformatics-file-system/
13:42 glusterbot <http://goo.gl/AiqfX> (at www.eaglegenomics.com)
13:42 johnmark sjoeboo: ^^
13:42 sjoeboo hmmm
13:46 semiosis johnmark: btw I'm giving out gluster stickers here :)
13:46 semiosis People like them
13:47 DataBeaver joined #gluster
13:49 DataBeaver Can you confirm something for me: If a file in .glusterfs only has a single link to it, that means the file has been deleted from the underlying filesystem directly, and can safely be deleted from .glusterfs as well?
13:49 dumbda joined #gluster
13:51 DataBeaver (This is a single-brick setup for sharing files between host and VM, and I just learned the deleted files stay in .glusterfs occupying disk space)
13:54 Scotch joined #gluster
13:55 raghug joined #gluster
13:55 BSTR joined #gluster
13:57 bugs_ joined #gluster
14:03 bennyturns joined #gluster
14:04 jbrooks joined #gluster
14:06 andreask joined #gluster
14:15 chouchins joined #gluster
14:23 dumbda is it possible to see the status of replica sync?
14:23 dumbda in glusterfs?
14:24 plarsen joined #gluster
14:24 johnmark semiosis: woohoo!
14:24 samppah semiosis: heyyyy, i want one too ;)
14:25 johnmark he heh :)
14:25 johnmark samppah: can you help us arrange a gluster workshop where you are?
14:26 johnmark in that case, we can make all the stickers you want
14:27 samppah uh oh :)
14:30 xiu hi, it seems that i have this problem: http://permalink.gmane.org/gmane.comp.file-systems.gluster.user/3594 but i can't find any solution (i'm running 3.2.6 on a distributed/replicated volume), is this a known problem ?
14:30 glusterbot <http://goo.gl/k2rE5> (at permalink.gmane.org)
14:32 puebele joined #gluster
14:39 dustint joined #gluster
14:40 dblack joined #gluster
14:45 lh joined #gluster
14:46 dumbda guys how does self heal work?
14:46 dumbda is it continuous?
14:46 dumbda i triggered it, but now i see for an hour that nothing comes to the replica server.
14:47 dumbda should i start the "find" command again on the fuse mount (client)?
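
The find-based crawl dumbda mentions is the commonly documented pre-3.3 way of forcing every file to be looked up (and therefore healed) through the client; a sketch, assuming the volume is fuse-mounted at /mnt/data:

    # stat every entry on the mount; files needing heal are repaired as a side effect of the lookup
    find /mnt/data -noleaf -print0 | xargs --null stat >/dev/null
    # on gluster 3.3+ the self-heal daemon can be told to do a full crawl instead:
    gluster volume heal data full
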
14:47 zykure|uni JoeJulian: glusterfs is awesome, it worked out of the box with mixed IPoIB and ethernet-clients :)
14:48 zykure|uni using different routes on the ethernet-clients
14:55 rosmo oh no, i had to reboot my stuff and now both nodes show as "State: Peer Rejected (Connected)" and one brick is down from other server
14:57 rosmo brick from another server shows just as "Brick is Not Connected"
15:00 shylesh joined #gluster
15:00 Norky joined #gluster
15:04 aliguori joined #gluster
15:04 semiosis ,, (peer-rejected)
15:05 semiosis ,,(peer-rejected)
15:05 glusterbot http://goo.gl/nWQ5b
15:05 semiosis ,,(peer rejected)
15:05 glusterbot I do not know about 'peer rejected', but I do know about these similar topics: 'peer-rejected'
15:06 semiosis See link. Sorry I'm lagging
15:06 rosmo semiosis: tried that couple of times
15:06 rosmo as soon as i probe the server, it gets the full volume list
15:07 rosmo so volume sync says please delete everything
15:07 semiosis Restart all your glusterds
15:08 semiosis If it auto syncs then you don't need to sync manually
15:08 rosmo ahh.. let me see
15:09 rosmo yeah, as soon as i probe the server from the good one (detaching the old peer first), it gets the volume list
15:10 rosmo ahh, sorry, good one from the bad one
15:10 rosmo allright! now it says accepted peer request
15:12 rosmo sweet, it worked
15:12 rosmo thanks a million, went through dozens of pages of google results, but i didn't find that one
15:13 rosmo the article should probably be amended that you need to double-check stuff and go through the exact steps
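
For the record, the recovery rosmo went through follows the commonly documented Peer Rejected procedure; a hedged summary, run on the rejected node, with goodserver standing in for a healthy peer:

    service glusterd stop
    # keep glusterd.info (this node's UUID), wipe the rest of the local state
    find /var/lib/glusterd -mindepth 1 -maxdepth 1 ! -name glusterd.info -exec rm -rf {} +
    service glusterd start
    # probe a known-good peer so the volume definitions sync back, then restart once more
    gluster peer probe goodserver
    service glusterd restart
    gluster peer status
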
15:15 zykure|uni rosmo: hey i had that problem just yesterday :)
15:15 zykure|uni seems to be very common
15:18 dumbda guys is there a way to see sync status on the volume in gluster?
15:18 dumbda other than df -h
15:19 dumbda on each brick
15:20 timothy joined #gluster
15:37 rwheeler joined #gluster
15:45 joaquim__ joined #gluster
15:46 ferrel joined #gluster
15:50 ferrel Hello all, I'm just getting started with geo-replication and GlusterFS in general. I'm wondering if someone could tell me if large geo-replicated VM disk image files are safe? These are files that are in use while being replicated. I assume there is some mechanism to make sure the files are consistent even though they are constantly changing? or maybe rsync handles this on its own somehow? ... sorry for my ignorance ... just looking
15:50 ferrel for some "warm fuzzy" feelings on the state of our backups :-D
15:51 hateya joined #gluster
15:53 JoeJulian ferrel: As the files change, the marker translator marks them for geo-rep, then gsyncd takes the files identified by marker and rsyncs them every so often. So your image will be in whatever state it was in when rsync last read it.
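
For context, geo-replication (the rsync-based mechanism JoeJulian is describing) is driven from the master side with commands along these lines, assuming a master volume named data; the slave target backup-host:/data/backup is purely illustrative:

    # start pushing changes from the master volume to the slave
    gluster volume geo-replication data backup-host:/data/backup start
    # see whether the session is healthy and how it is progressing
    gluster volume geo-replication data backup-host:/data/backup status
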
15:54 JoeJulian zykure|uni: Oh, good. Now write up a blog or wiki article on the specifics of how to make that work. :D
15:58 rastar joined #gluster
15:58 dumbda Joe is self-heal continious?
15:58 dumbda or it might stop in the middle after triggered.
16:00 ferrel JoeJulian: thanks for the explanation.
16:21 red_solar joined #gluster
16:21 JoeJulian dumbda: afaik, self heal should be continuous. The only reason I can think of that it wouldn't be is if the connection got interrupted.
16:24 bala joined #gluster
16:24 jbrooks joined #gluster
16:24 36DAABB0V joined #gluster
16:25 jclift joined #gluster
16:32 nueces joined #gluster
16:42 ferrel left #gluster
16:42 theron joined #gluster
16:42 saurabh joined #gluster
16:52 dumbda joined #gluster
16:52 dumbda Guys i need help
16:53 dumbda i can't access my mounts as self heal takes up all the bandwidth.
16:53 dumbda How can i stop it?
16:53 dumbda It is production box.
16:54 daMaestro joined #gluster
16:55 jeffrin joined #gluster
16:56 jclift JoeJulian: ^^^ Any ideas?
17:00 dumbda my memory usage on gluster is through the roof
17:00 dumbda why is the self heal so insane?
17:00 dumbda Is there a way to stop it please.
17:01 dumbda i do not want to stop the daemon
17:03 dumbda God, please someone.
17:04 rwheeler joined #gluster
17:05 tomsve joined #gluster
17:09 guest1000 joined #gluster
17:10 semiosis dumbda: you can reduce the number of files being healed in parallel but I can't get you the command eight now
17:10 semiosis Right*
17:11 semiosis Google for background self heal count option
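
The option semiosis has in mind is cluster.background-self-heal-count; a sketch of turning it down, using dumbda's volume name data:

    # limit how many files a client will self-heal in the background at once (historically 16 by default)
    gluster volume set data cluster.background-self-heal-count 1
    # confirm the option took effect
    gluster volume info data
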
17:14 dumbda can i stop it completely on the volume till the weekend?
17:14 dumbda the self-heal process
17:15 hateya joined #gluster
17:16 semiosis Perhaps by killing it, see ,,(processes)
17:16 glusterbot the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more information.
17:16 semiosis Link
17:18 DataBeaver Let's try this again now that some people are present: Is it safe to nuke files from .glusterfs that only have a single link?  AFAIU deleting files directly from the storage volume causes them to dangle there.
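
A way to list the candidates DataBeaver is describing, i.e. regular files under .glusterfs whose only remaining hard link is the .glusterfs entry itself; the brick path /export/brick is a placeholder, and this only lists them rather than deleting anything:

    # regular files with a link count of 1: no matching file left in the brick's normal tree
    find /export/brick/.glusterfs -type f -links 1
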
17:18 edong23 joined #gluster
17:19 georgeh|workstat is anyone familiar with fuse clients showing stale NFS file handle messages?  this is a fuse mount not nfs mount of a volume
17:23 dumbda so it is ok to kill the self heal process and trigger it again on the weekend?
17:25 semiosis I don't know if it's "ok" but it is possible
17:30 dumbda yeah, i killed it and killed all the related rsync processes and still the mounts are so slow
17:30 dumbda cd into the folder and ls
17:30 dumbda takes forever.
17:32 dumbda and on the server something still triggering the rsync.
17:38 lalatenduM joined #gluster
17:41 hateya joined #gluster
17:53 JoeJulian dumbda: No. You won't be able to stop self-healing. Even if you stop the self-heal daemon, all that's going to do is move the self-healing to the clients.
17:53 JoeJulian What rsync processes???
17:53 dumbda what does it mean?
17:53 JoeJulian There are no rsync processes for self-heal.
17:54 dumbda i stopped self healing daemon for now
17:54 dumbda well i saw like 60 of them in D state
17:54 dumbda which consumed all the memory.
17:54 JoeJulian Again, all that means is that if a client touches a file that needs to be healed, that client will perform the heal.
17:54 JoeJulian fpaste "gluster volume info"
17:55 JoeJulian The only thing gluster would use rsync for is geo-replication.
17:55 dumbda Joe one sec i have to run to another issue in the other building.
17:55 dumbda If you can spare some of your time i can come by around 40 minutes.
17:56 dumbda I am sorry, that one is just as important.
17:57 dumbda need to roll in backup for the router, some idiot from our team did " no router bgp" on the building core router, what a  f day again.
17:57 sohoo joined #gluster
17:58 Mo___ joined #gluster
17:59 JoeJulian I'll be around.
18:01 sohoo hello everyone, how do i restart a single brick proccess? i see 4 bricks showing offline
18:02 sohoo im not sure i know how to debug this, any help will be great
18:02 ricky-ticky joined #gluster
18:03 JoeJulian On the server that has the failed bricks, "gluster volume start $vol force" will do it.
18:03 JoeJulian Make sure to check the brick logs first to see if there's an underlying reason for them to be offline.
18:07 sohoo thx joe where are the brick's logs?
18:09 hateya joined #gluster
18:10 jeffrin left #gluster
18:10 hateya joined #gluster
18:15 hagarth joined #gluster
18:31 plarsen joined #gluster
18:46 ricky-ticky joined #gluster
18:49 dumbda Joe still very slow.
18:49 dumbda even though i disabled daemon, freed up the memory.
18:49 dumbda but still not the same as it was yesterday before i added the replica.
18:50 dumbda it is still adding the files into the
18:51 dumbda replica server from the client, it seems.
18:55 hateya joined #gluster
18:56 ricky-ticky joined #gluster
18:56 zaitcev joined #gluster
18:58 dumbda I did cluster.data-self-heal off
18:58 dumbda but i still can see packets to the replica coming from the client.
18:58 dumbda does glusterfs, when in replica, send new data to both servers?
18:59 hateya joined #gluster
19:08 elyograg dumbda: clients talk to all servers, yes.
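
This is why turning cluster.data-self-heal off did not stop the cross-region traffic: in a replica volume the FUSE client writes to every replica brick itself, independently of self-heal. For reference, the self-heal-related switches (using dumbda's volume name data) only affect healing, not those normal replicated writes:

    gluster volume set data cluster.data-self-heal off
    gluster volume set data cluster.metadata-self-heal off
    gluster volume set data cluster.entry-self-heal off
    gluster volume set data cluster.self-heal-daemon off
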
19:09 andreask joined #gluster
19:11 dumbda right.
19:11 dumbda i see in tcpdump
19:11 dumbda that the client sends the data to both servers.
19:12 dumbda So apparently AWS with replicas in east coast -> west coast not working for me.
19:12 dumbda as the latency is ridiculously big.
19:12 JoeJulian sohoo: /var/log/glusterfs/bricks
19:12 dumbda is there any way to improve the performance?
19:12 ricky-ticky joined #gluster
19:13 dumbda Joe on the server?
19:13 dumbda or client?
19:13 JoeJulian dumbda: Yes, don't replicate across high latency connections. :D
19:13 msmith_ Hmm, my gluster has started doing a lot of cpu/network for very little work.  6 servers, each has a single raid6 brick. volume is (3x2 (replica2)).  the traffic to each cluster server is about 150kbps, but the cluster traffic is about 20Mbps
19:14 JoeJulian msmith_: "gluster volume heal $vol info" to see if there's heals pending. Sounds like maybe a client lost connection with a server, if I were going to guess. Check your logs.
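
The heal-info subcommands JoeJulian is pointing at (gluster 3.3 and later) break the picture down a little further:

    gluster volume heal $vol info              # entries currently pending heal
    gluster volume heal $vol info healed       # entries healed recently
    gluster volume heal $vol info heal-failed  # entries that could not be healed
    gluster volume heal $vol info split-brain  # entries in split-brain
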
19:15 dumbda So Joe the only solution for me in this case is to schedule downtime
19:16 dumbda remove that peer
19:16 dumbda stop the volume
19:16 dumbda delete it, and recreate a new one as it was before
19:16 dumbda distributed one.
19:17 JoeJulian I /think/ you can do the remove-brick process to delete the bricks from the other coast, and add "replica 1" to do that.
19:18 JoeJulian so "gluster volume $foo remove-brick replica 1 eastcoast1:/brick/foo eastcoast2:/brick/foo eastcoastN:/brick/foo"
19:18 JoeJulian but I may be wrong.
19:19 dumbda i am just afraid this command might erase data on the FS
19:19 dumbda then i'll get fired -))).
19:19 msmith_ joe: it shows 1 entry, 23, 1, 1, 1, 1 for each brick respectively.  about half are .lock files, and total estimated space for everything listed might be <150M
19:19 JoeJulian I'm a bit busy trying to figure out why the stupid windows clients in our corporate office are now refusing to print to one single printer, configured exactly the same as every other printer on our cups/samba server.
19:20 JoeJulian remove-brick does not erase data.
19:20 dumbda Oh ok, then i will try that.
19:20 dumbda once i removed the breaks
19:20 dumbda bricks
19:20 dumbda i can remove that peer afterwards.
19:20 dumbda ?
19:21 msmith_ just re-ran it again, and now its 12, 3, 2, 1, 25, 25 entries each
19:21 msmith_ does that mean my system is in a constant state of healing?
19:22 JoeJulian Looks that way.
19:23 JoeJulian I've seen that before when one client has lost (and is not successful in reattaching to) one server.
19:23 JoeJulian Of course, that also could just represent a transient state and be completely unrelated.
19:24 JoeJulian What application is creating so many .lock files?
19:25 msmith_ dovecot mail, .lock's the files before updating them
19:29 dumbda Joe but my original command was "gluster create volume $foo replica 2 transport tcp server1:/foo server2:/foo
19:29 dumbda so when removing the brick, replica 2 will change to replica 1
19:29 dumbda and the right-hand brick will be removed.
19:29 dumbda ?
19:30 JoeJulian For that command, "gluster volume remove-brick $foo replica 1 server2:/foo"
19:31 JoeJulian For that command, "gluster volume remove-brick $foo replica 1 server2:/foo force"
19:31 JoeJulian That will make $foo a distribute only volume on just server1.
19:31 JoeJulian But if there's more servers, you'll need to list all the right-hand bricks.
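
Putting JoeJulian's corrected command together with the follow-up question about the peer, the sequence for dumbda's two-server volume looks roughly like this; nothing here touches the data already on server1's brick:

    # drop the west-coast replica and reduce the volume to replica 1
    gluster volume remove-brick $foo replica 1 server2:/foo force
    # once server2 holds no bricks for any volume, it can be detached from the pool
    gluster peer detach server2
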
19:32 dumbda Okay Thanks.
19:32 dumbda Apparently before doing that i need to unmount all the clients?
19:38 msmith_ heh, getting a blistering 29 iops through the fuse, with it in its current state
19:39 dumbda Or it is not necessary to unmount volume from the client?
19:43 msmith_ so far i've seen glusterfsd peak at 325% cpu in top.  averaging around 60-70%
19:45 elyograg dumbda: in my experiences with 3.3.x, brick operations do not take down the volume.
19:45 johnmark anybody have experience exporting mediawiki to markdown?
19:46 dumbda Thank you.
19:47 elyograg you just have to be absolutely sure you're removing the correct things, but that's always the best way to work.
19:48 dumbda well i will remove brick from server2
19:48 dumbda i am sure about this one
19:48 dumbda if i remove it on the first server,
19:48 dumbda will it erase all data from the FS?
19:49 dumbda I would not think so, as glusterfs implementation is just metadata as far as i understand.
19:49 elyograg removing bricks doesn't actually delete anything. the actual brick filesystems will be unaffected.
19:49 dumbda Thanks.
19:53 dumbda Thank you very much it worked just fine.
20:05 joehoyle joined #gluster
20:13 Azrael joined #gluster
20:40 \_pol joined #gluster
20:45 tomsve joined #gluster
20:49 jclift dumbda: Just read through all the struggles there.  It seems like you're trying out commands you're unsure of in a live production environment.
20:50 jclift dumbda: As a way to reduce the danger, it might be a good idea to install Gluster in a VM sometime, so you can try out the commands there first and make sure they do what you want.
20:50 jclift dumbda: Just a thought anyway. :D
20:51 dumbda yeah.
20:52 dumbda I could spin up a test AWS instance beforehand, but it would be a struggle to replicate all the data flows we receive from the web front, where the gluster share is mounted.
20:53 dumbda Thanks, though.
20:59 kostagr33k joined #gluster
21:00 kostagr33k hey guys... having a problem where a replace-brick went awry and now i cant cancel/abort the command
21:00 kostagr33k is there a way to clear that out or is that truly a bug, from what i have been seeing?
21:18 JoeJulian jclift: Real sysadmins do it in production....
21:18 jclift JoeJulian: :p
21:19 jclift JoeJulian: Heh, the "learning on the job" thing... not sure if "lets reconfigure the distributed storage system of our prod servers, that I'm not all that familiar with" is a good example of doing it properly. :)
21:20 elyograg http://2.bp.blogspot.com/--QltsGDvxe0/TbiNVZEmSXI/AAAAAAAAAMA/3Vyos1hrvqk/s1600/Dos-Equis-Man.jpg
21:20 JoeJulian kostagr33k: You can probably finish the job with commit or commit force.
21:20 glusterbot <http://goo.gl/SQe4c> (at 2.bp.blogspot.com)
21:20 jclift elyograg: :)
21:20 elyograg second time i've pasted that today. :)
21:21 kostagr33k did not try commit force
21:21 kostagr33k will check that
21:21 kostagr33k that may have done the trick .. so if you have a failed command and commit force it, it will just close out the command without doing any further interaction?
21:28 JoeJulian commit force will force the conclusion of the replace-brick. If this is a replicated volume, I would recommend a heal...full to ensure that any files that didn't get migrated during the failed operation are repaired.
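
Spelled out, the recovery JoeJulian recommends looks roughly like this; the volume and brick names are placeholders for kostagr33k's actual ones:

    # force the stuck replace-brick operation to conclude
    gluster volume replace-brick $vol old-server:/brick new-server:/brick commit force
    # then walk the whole volume so anything missed during the failed migration gets repaired
    gluster volume heal $vol full
    gluster volume heal $vol info    # watch the pending entries drain
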
21:36 ramkrsna joined #gluster
21:49 hateya joined #gluster
22:05 ramkrsna joined #gluster
22:16 _br_ joined #gluster
22:22 _br_ joined #gluster
22:23 kostagr33k thanks JoeJulian
22:23 _br_ joined #gluster
22:28 ramkrsna joined #gluster
22:28 JoeJulian kostagr33k: You're welcome. :D
22:29 kostagr33k but for some reason the cluster is hosed on one node :/
22:29 kostagr33k wont startup
22:29 kostagr33k any idea what this means:  E [graph.c:294:glusterfs_graph_init] 0-management: initializing translator failed
22:32 _pol joined #gluster
22:37 disarone joined #gluster
22:38 k7__ joined #gluster
22:38 JoeJulian No idea... That's the "last straw" though. Maybe fpaste the rest of the log...
22:40 kostagr33k its odd
22:40 kostagr33k had one file that was 0 bytes
22:40 kostagr33k checked the other servers, replaced that one file and it started up
22:41 kostagr33k seemed it forgot about one of the peers. super odd
22:44 ninkotech joined #gluster
22:44 ninkotech__ joined #gluster
23:30 wenzi joined #gluster
