IRC log for #gluster, 2013-10-23


All times shown according to UTC.

Time Nick Message
00:12 Xunil what are folks doing for HA with gluster?  i recognize you can have a volume duplicate its data across multiple servers, but what about for the NFS mount?  most services i deploy have 2 or more servers behind a shared IP
00:13 Xunil sometimes that shared IP is plumbed on a load balancer, sometimes it's just a manually-plumbed secondary IP on an ethernet interface on one of the servers
00:23 vpshastry joined #gluster
00:24 vpshastry left #gluster
00:46 yinyin joined #gluster
00:53 johnbot11 joined #gluster
01:18 chirino joined #gluster
01:19 Xunil there's a lot of cautions in the Admin Manual about only using striping for 'high concurrency environments with large files'; is this an unstable feature?  does it come with some non-obvious performance penalty?
01:19 JoeJulian @stripe
01:19 glusterbot JoeJulian: Please see http://goo.gl/5ohqd about stripe volumes.
01:19 Xunil also there's a lot of broken links on the gluster site :(
01:19 Xunil JoeJulian: thanks :) reading
01:20 JoeJulian Most people use one of the floating-ip services for nfs. I prefer the fuse mount to avoid all that.
01:20 Xunil will the fuse subsystem handle a down server gracefully, then?
01:22 JoeJulian yes
01:22 JoeJulian assuming a replicated volume.
01:22 Xunil right, of course.
01:22 Xunil my use case is write-mostly
01:22 Xunil of very large files, on the order of hundreds of GB
01:23 Xunil (compressed, encrypted database backups)
01:24 JoeJulian The fuse client connects to all the bricks in the volume via TCP. Writes will go to replicas simultaneously*. If a server is taken down, it gracefully closes its tcp connection and the client is then aware that it's gone. The client continues working with the remaining replica. When the missing brick returns, the client reconnects... (and does magic if necessary)
01:25 Xunil nice, that's pretty cool
01:25 Xunil and definitely sounds like the behavior i want
01:27 Xunil for this use case i'm not as concerned about I/O performance as i am about reliability
01:28 JoeJulian Well with replication you're going to be splitting your bandwidth. Ensure you have as much of that as you can get. Raid your drives to ensure they can keep up and you're golden.
01:29 JoeJulian Gotta run. Dinner time.
01:29 Xunil JoeJulian: thanks for the advice
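
A minimal sketch of the replicated-volume-plus-fuse setup described above; hostnames, volume name, and brick paths are placeholders, not taken from this log:

    # two-way replicated volume across two servers (placeholder names)
    gluster volume create gv0 replica 2 server1:/data/brick1 server2:/data/brick1
    gluster volume start gv0
    # the fuse client talks to every brick itself, so no floating IP is required
    mount -t glusterfs server1:/gv0 /mnt/gv0
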
01:41 bala joined #gluster
02:01 lpabon joined #gluster
02:02 bayamo joined #gluster
02:04 jag3773 joined #gluster
02:06 nasso joined #gluster
02:27 Xunil possibly dumb question: i have a replicate-distribute volume, i created 1000 files on it, but i only see them in the brick mounts on the first server
02:27 Xunil why/
02:27 Xunil ?
02:42 johnbot11 joined #gluster
02:46 JoeJulian Xunil: Good question. ,,(pasteinfo)
02:46 glusterbot Xunil: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
02:47 johnbot11 Hope someone can share some advice. I read in some (possibly old) AWS related Gluster document that it's extremely unwise to configure Gluster servers to use the internal address since they may change once in awhile in AWS and poof there goes part or all of your filesystem. I run in a VPC so I would believe that my internal IP's should stay the same. Since I already setup the GlusterFS with internal (VPC) ip's, should I be worried?
02:47 Xunil JoeJulian: ok, one sec
02:47 JoeJulian I can share advice: it's okay to date nuns...
02:47 JoeJulian ... as long as you don't get in the habit.
02:48 johnbot11 nice ;)
02:48 JoeJulian johnbot11: I think you're fine.
02:48 Xunil JoeJulian: https://dpaste.de/v5gw
02:48 JoeJulian though, I always prefer to use ,,(hostnames)
02:48 glusterbot Title: dpaste.de: Snippet #244603 (at dpaste.de)
02:48 glusterbot Hostnames can be used instead of IPs for server (peer) addresses. To update an existing peer's address from IP to hostname, just probe it by name from any other peer. When creating a new pool, probe all other servers by name from the first, then probe the first by name from just one of the others.
02:48 Xunil er whoops
02:48 Xunil that's node
02:49 JoeJulian Heh, that's peer status
02:49 Xunil https://dpaste.de/jrwb
02:49 glusterbot Title: dpaste.de: Snippet #244604 (at dpaste.de)
02:49 Xunil there we go
02:49 johnbot11 Thanks JoeJulian
02:51 johnbot11 just curious, if once of the 5 servers hosting my bricks changes IP (which I don't believe will happen), I would not longer be able to access just those bricks (and spread out data) that the bricks contained right?
02:51 johnbot11 one not once
02:51 JoeJulian Xunil: So your 1000 files were created in cam10.yvr1.quux:/data/brick{1,2,3} but spread across them?
02:51 JoeJulian JonathanD: right.
02:51 JoeJulian JonathanD: Not you.... Sorry
02:52 JoeJulian johnbot11: I missed the "h"... right.
02:52 johnbot11 oops
02:52 JoeJulian tab completion sucks sometimes.
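
A minimal sketch of the IP-to-hostname switch glusterbot describes above; the hostname is a placeholder:

    # from any other peer, re-probe the existing peer by name to replace its IP
    gluster peer probe gluster1.example.com
    # confirm the peer now shows up by hostname
    gluster peer status
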
02:52 Xunil JoeJulian: yep, looks that way
02:53 JoeJulian Xunil: But they weren't replicated to the other servers.
02:53 Xunil JoeJulian: when i created this i specified cam10:/data/brick1 cam15:/data/brick1 cam16:/data/brick1 cam10:/data/brick2...
02:53 Xunil i wonder if the ordering is what caused this
02:53 Xunil JoeJulian: afaict, no, not replicated elsewhere
02:53 JoeJulian Your ordering looks right. ,,(brick order)
02:53 glusterbot Replicas are defined in the order bricks are listed in the volume create command. So gluster volume create myvol replica 2 server1:/data/brick1 server2:/data/brick1 server3:/data/brick1 server4:/data/brick1 will replicate between server1 and server2 and replicate between server3 and server4.
02:53 Xunil ok
02:54 JoeJulian Xunil: check your client log. Perhaps it's not connecting to the other bricks for some reason. (usually iptables)
02:54 Xunil righty-o
02:55 Xunil 0-gv0-client-4: readv failed (No data available)
02:55 Xunil seems like that might be the issue :)
02:55 JoeJulian check gluster volume status
02:55 johnbot11 JoeJulian: I read somewhere awhile ago that a cloud/aws glusterfs best practice document was in the works. happen to know of any floating around?
02:56 JoeJulian hrm...
02:56 JoeJulian @semiosis tutorial
02:56 glusterbot JoeJulian: http://goo.gl/6lcEX
02:56 Xunil JoeJulian: gluster volume status shows everything online, but i have iptables rules that disallow most of the ports i see in netstat -plant | grep gluster | grep LISTEN
02:56 JoeJulian ooh, that was the factoid I was thinking of...
02:56 JoeJulian @ports
02:56 glusterbot JoeJulian: glusterd's management port is 24007/tcp and 24008/tcp if you use rdma. Bricks (glusterfsd) use 24009 & up for <3.4 and 49152 & up for 3.4. (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111 and 2049 since 3.4.
02:57 Xunil JoeJulian: is there a comprehensive list of the ports used by gl... heh, thanks :)
02:57 Xunil hmm
02:57 Xunil ok
02:57 Xunil no way to limit it to predefined ports, i suppose
02:57 JoeJulian Nope
02:58 JoeJulian There's a ,,(puppet module) that does the firewall magic though.
02:58 glusterbot JoeJulian: Error: No factoid matches that key.
02:58 JoeJulian @puppet
02:58 glusterbot JoeJulian: (#1) https://github.com/purpleidea/puppet-gluster, or (#2) semiosis' unmaintained puppet module: https://github.com/semiosis/puppet-gluster
02:58 JoeJulian #1
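
A sketch of firewall rules matching the port list glusterbot gives above, assuming plain iptables and glusterfs 3.4; widen the brick range to cover however many bricks each server carries:

    iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT       # glusterd management (+ rdma)
    iptables -A INPUT -p tcp --dport 49152:49200 -j ACCEPT       # bricks (glusterfsd), 3.4 and later
    iptables -A INPUT -p tcp --dport 38465:38468 -j ACCEPT       # gluster NFS + NLM
    iptables -A INPUT -p tcp -m multiport --dports 111,2049 -j ACCEPT   # rpcbind/portmap and NFS
    iptables -A INPUT -p udp --dport 111 -j ACCEPT               # portmapper over UDP
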
03:01 MrNaviPa_ joined #gluster
03:01 bharata-rao joined #gluster
03:03 kshlm joined #gluster
03:07 semiosis i think i just uploaded the source package to the 3.4.1 apt repo
03:08 semiosis @later tell jord-eye check the 3.4.1 debian repo for the source package & let me know if that works.  thanks!
03:08 glusterbot semiosis: The operation succeeded.
03:08 shubhendu joined #gluster
03:27 semiosis @later tell partner check the 3.4.1 and 3.3.2 debian repos for the source packages & let me know if that works.  thanks!
03:27 glusterbot semiosis: The operation succeeded.
03:30 Xunil interesting
03:31 Xunil i fixed the firewall issues, but the files i'd originally created didn't seem to get replicated
03:31 Xunil i created new files, and those *do* show up on the other brick servers
03:33 semiosis Xunil: what version of glusterfs?
03:34 Xunil 3.4, just installed
03:34 Xunil # rpm -q glusterfs
03:34 Xunil glusterfs-3.4.1-2.el6.x86_64
03:34 mohankumar joined #gluster
03:35 semiosis those files should be automatically healed at some point, or when you access them (even just doing an ls -la or stat them) through the client, or running gluster volume heal $vol full, iirc
03:35 Xunil ok
03:35 Xunil i'm just peeking at the exports on the brick servers
03:35 semiosis right
03:36 Xunil aha, the heal fix it
03:36 Xunil fixed*
03:36 semiosis good
03:36 semiosis afaik that is supposed to happen automatically
03:36 Xunil i guess i expected that to happen automatically when the connectivity was restored
03:36 semiosis but idk when or how often
03:37 Xunil nod
03:37 Xunil network partitions like that certainly aren't the norm in my environment, so i'm not too worried
03:38 Xunil it is pretty encouraging that the firewall problems caused zero problems for the client
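
For reference, the heal commands touched on above, using the volume name gv0 that appears in Xunil's client log; this is standard 3.4 CLI:

    gluster volume heal gv0 full           # force a full self-heal sweep
    gluster volume heal gv0 info           # entries still pending heal
    gluster volume heal gv0 info healed    # entries healed recently
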
03:39 Raymii joined #gluster
03:45 RameshN joined #gluster
03:52 itisravi joined #gluster
03:57 dusmant joined #gluster
03:57 sgowda joined #gluster
03:58 ziiin joined #gluster
03:59 shylesh joined #gluster
04:15 sac`away joined #gluster
04:17 ndarshan joined #gluster
04:25 RameshN joined #gluster
04:28 vpshastry joined #gluster
04:29 kanagaraj joined #gluster
04:37 ppai joined #gluster
04:40 ndarshan joined #gluster
04:42 meghanam joined #gluster
04:43 meghanam_ joined #gluster
04:45 rjoseph joined #gluster
04:59 psharma joined #gluster
05:08 nshaikh joined #gluster
05:08 johnbot1_ joined #gluster
05:08 anands joined #gluster
05:12 XpineX joined #gluster
05:14 spandit joined #gluster
05:18 ababu joined #gluster
05:21 aravindavk joined #gluster
05:22 CheRi_ joined #gluster
05:28 Skaag joined #gluster
05:34 bala joined #gluster
05:50 raghu joined #gluster
05:53 lalatenduM joined #gluster
06:07 RameshN joined #gluster
06:08 cnfourt joined #gluster
06:13 mbukatov joined #gluster
06:26 satheesh1 joined #gluster
06:28 ngoswami joined #gluster
06:44 spandit joined #gluster
06:47 shyam joined #gluster
06:56 rastar joined #gluster
07:01 ekuric joined #gluster
07:04 ricky-ticky joined #gluster
07:05 keytab joined #gluster
07:06 ctria joined #gluster
07:11 Raymii joined #gluster
07:21 eseyman joined #gluster
07:21 rjoseph joined #gluster
07:37 jord-eye semiosis: thank you for the sources. Nevertheless they didn't work on squeeze. Some changes have to be done in debian directory. Here is the patch: http://ur1.ca/fxang
07:37 jord-eye basically, debhelper on squeeze is version 8, and some path problems.
07:38 jord-eye there's also a fuse dependency; it is called 'fuse-utils' on squeeze, instead of just 'fuse'
07:39 jord-eye after the patch they compiled ok and I have them installed and working
07:40 jord-eye maybe it's also worth noting these warnings:
07:40 jord-eye dpkg-gencontrol: warning: Depends field of package glusterfs-client: unknown substitution variable ${shlibs:Depends}
07:40 jord-eye dpkg-gencontrol: warning: Pre-Depends field of package glusterfs-common: unknown substitution variable ${misc:Pre-Depends}
07:40 jord-eye dpkg-gencontrol: warning: Depends field of package glusterfs-dbg: unknown substitution variable ${shlibs:Depends}
07:40 _Riper_ joined #gluster
07:40 _Riper_ hi
07:40 glusterbot _Riper_: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
07:40 jord-eye I don't know if they're caused by the debhelper version...
07:42 _Riper_ first of all, sorry for my english. That's the first time that I use this chat. Do I have to explain my doubt/problem here, open wide?
07:42 _Riper_ thanks
07:44 andreask joined #gluster
07:49 jord-eye hi _Riper_. I'm afraid I can't help you. I'm a simple user, not admin of the this channel. I just leave a message for later
07:50 jord-eye I guess the other users are just... sleeping :)
07:50 _Riper_ ok, thanks jord-eye
07:50 partner semiosis: brilliant! thanks i'll check the stuff out and let you know how it worked
07:51 _Riper_ i know that's hard to explain this kind of errors
07:56 _Riper_ I'm running two glusterfs servers. Those servers run SLES11 and gluster 3.0.5. The client side is more heterogeneous: some (20%) run Debian and the rest (80%) run SLES11 too.
07:57 _Riper_ clients writes to two clients at same time, replicating data.
07:57 _Riper_ A few days ago I had network problems with the second gluster server, and the clients only wrote to the first one.
07:58 _Riper_ the problem was transparent for the users (and for me!  :-(  )
07:58 _Riper_ when I fix the problem with the network, then there was a lot of problems on the client side.
07:59 _Riper_ I've tried some healing tricks that I found on the internet, like the recursive "find" or "ls" command executed from the client side
08:00 _Riper_ I hoped these tricks would find the differences between server 1's data and server 2's data and fix them... but they don't.
08:00 _Riper_ two days later and there's still more data on server 1 than on server 2
08:01 _Riper_ but that's not really the big problem. The most annoying thing is that sometimes this message appears on the client side: ls: reading directory .: File descriptor in bad state
08:02 _Riper_ I've tried restarting (software, not hardware) on both the server and client side... but the problem is still there.
08:02 _Riper_ this error (ls: reading directory .: File descriptor in bad state) appears randomly
08:02 _Riper_ maybe on the second, third, or sixth time that you try to "ls" the contents of the folder, the data finally appears
08:03 _Riper_ I'm lost with that
08:04 _Riper_ I can provide configuration files and error logs to anyone who can help me
08:04 _Riper_ thank you.
08:05 calum_ joined #gluster
08:06 _Riper_ ERROR: few lines before, I said:  "clients writes to two clients at same time, replicating data." and it's not correct. I mean:   "clients writes to two gluster servers at same time, replicating data."
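
For what it's worth, the recursive find/stat trick _Riper_ alludes to (the gluster volume heal command does not exist on 3.0.x) is usually run against the client mount, roughly like this; the mount path is a placeholder:

    # stat every file through the fuse mount to trigger self-heal on the replica
    find /mnt/gluster -noleaf -print0 | xargs --null stat > /dev/null
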
08:10 kPb_in_ joined #gluster
08:13 _pll_ joined #gluster
08:17 DV joined #gluster
08:19 partner jord-eye: i see you resolved the issues already, i'll just steal your work then, thank you very much :)
08:19 jord-eye sure! :)
08:28 mgebbe_ joined #gluster
08:29 kiwi_64450 joined #gluster
08:29 kiwi_64450 left #gluster
08:44 StarBeast joined #gluster
08:44 spandit joined #gluster
08:45 hngkr_ joined #gluster
08:47 anands joined #gluster
08:54 vimal joined #gluster
08:56 asias joined #gluster
09:09 tryggvil joined #gluster
09:10 dneary joined #gluster
09:24 morse joined #gluster
09:25 baoboa joined #gluster
09:25 vshankar joined #gluster
09:28 shruti joined #gluster
09:44 ngoswami joined #gluster
09:45 dneary joined #gluster
09:55 rastar joined #gluster
10:07 X3NQ joined #gluster
10:11 ngoswami joined #gluster
10:12 khushildep joined #gluster
10:22 samppah hagarth: ping?
10:30 hagarth samppah: pong
10:31 ngoswami joined #gluster
10:33 RameshN joined #gluster
10:34 samppah hagarth: hmmh.. i'm not sure if this is happening because of gluster or ovirt, but i'm seeing glusterfs client crash at the end of add-brick and rebalance..
10:34 samppah and
10:35 samppah i'm wondering if it's possible that it's crashing because i'm using the same lvm slice for brick1 and brick2 for testing?
10:35 samppah ie. lvm slice is mounted at /gluster and bricks are /gluster/brick1 and /gluster/brick2
10:36 hagarth samppah: do you have a backtrace?
10:41 asias joined #gluster
10:43 samppah hagarth: just a moment.. i'll try to rerun the test
10:47 shubhendu joined #gluster
10:47 harish_ joined #gluster
10:49 cyberbootje joined #gluster
10:52 jmeeuwen joined #gluster
10:54 RameshN joined #gluster
10:56 samppah hagarth: Commands run on server http://pastie.org/8423691, Client log http://pastie.org/8423690 and backtrace http://pastie.org/8423689
10:56 glusterbot Title: #8423690 - Pastie (at pastie.org)
10:56 samppah please let me know if you need more info from gdb.. i'm not that familiar with it :)
10:59 hagarth samppah: can you try gdb /usr/sbin/glusterfs core.10936.1382525558.dump
11:00 samppah hagarth: http://pastie.org/8423699
11:00 glusterbot Title: #8423699 - Pastie (at pastie.org)
11:13 hagarth samppah: can you please open a bug for this one?
11:13 samppah hagarth: sure
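
The usual way to pull the requested trace out of a glusterfs core like the one above (core file name as in the log; the gdb commands are generic):

    gdb /usr/sbin/glusterfs core.10936.1382525558.dump
    (gdb) bt                     # backtrace of the crashing thread
    (gdb) thread apply all bt    # all threads, useful to attach to the bug report
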
11:24 edward2 joined #gluster
11:24 ppai joined #gluster
11:28 CheRi_ joined #gluster
11:29 vpshastry1 joined #gluster
11:30 bayamo joined #gluster
11:36 ndarshan joined #gluster
11:43 RameshN joined #gluster
11:44 ctria joined #gluster
11:46 kkeithley partner, JoeJulian: I did dpkgs for Debian for 3.3.0 IIRC. It was so painful I've never done it again. (Doctor, it hurts when I go like this.)
11:48 tryggvil joined #gluster
11:55 Raymii joined #gluster
12:00 mbukatov joined #gluster
12:01 shubhendu joined #gluster
12:09 B21956 joined #gluster
12:11 ababu joined #gluster
12:11 jikz joined #gluster
12:11 ctria joined #gluster
12:13 Raymii joined #gluster
12:16 Raymii joined #gluster
12:17 Raymii joined #gluster
12:17 itisravi joined #gluster
12:19 dusmant joined #gluster
12:27 DV joined #gluster
12:36 rcheleguini joined #gluster
12:37 onny1 joined #gluster
12:39 haritsu joined #gluster
12:44 kkeithley1 joined #gluster
12:47 hybrid512 joined #gluster
12:47 samppah hagarth: https://bugzilla.redhat.com/show_bug.cgi?id=1022510 i hope i remembered everything.. it has been a busy day
12:47 glusterbot <http://goo.gl/LLUJAH> (at bugzilla.redhat.com)
12:47 glusterbot Bug 1022510: unspecified, unspecified, ---, amarts, NEW , GlusterFS client crashes during add-brick and rebalance
12:47 samppah thanks for your help :)
12:51 hagarth samppah: thanks for the report
12:52 vpshastry joined #gluster
12:53 vpshastry left #gluster
12:54 klaxa|web joined #gluster
12:57 marbu joined #gluster
13:00 lalatenduM joined #gluster
13:03 calum_ joined #gluster
13:05 glusterbot New news from newglusterbugs: [Bug 1022510] GlusterFS client crashes during add-brick and rebalance <http://goo.gl/LLUJAH>
13:15 mbukatov joined #gluster
13:16 anands joined #gluster
13:24 kkeithley1 joined #gluster
13:31 bennyturns joined #gluster
13:38 kkeithley1 joined #gluster
13:40 shylesh joined #gluster
13:41 vpshastry joined #gluster
13:45 bsaggy joined #gluster
13:49 bala joined #gluster
13:52 bugs_ joined #gluster
13:55 saurabh joined #gluster
13:55 vpshastry left #gluster
13:58 plarsen joined #gluster
13:59 mohankumar joined #gluster
14:04 calum_ joined #gluster
14:06 DV joined #gluster
14:07 wushudoin joined #gluster
14:10 dneary joined #gluster
14:13 failshell joined #gluster
14:23 klaxa|work joined #gluster
14:30 klaxa|work hi, is there a way to observe the self heal daemon in 3.3.2? or rather, my actual problem is, how do i find out how long the replication with granular locking takes until the data on both bricks is synchronized again?
14:30 calum_ joined #gluster
14:42 onny1 klaxa|work: interesting question
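
One rough way to gauge this on 3.3.x is to watch the pending-heal queue drain after the bricks reconnect; a sketch, with the volume name as a placeholder:

    gluster volume heal myvol info           # entries the self-heal daemon still has to process
    gluster volume heal myvol info healed    # entries it has finished recently
    # re-run until every brick reports "Number of entries: 0"
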
14:47 glusterbot New news from resolvedglusterbugs: [Bug 951549] license: xlators/protocol/server dual license GPLv2 and LGPLv3+ <http://goo.gl/WqDBI5> || [Bug 951551] license: xlators/protocol/server dual license GPLv2 and LGPLv3+ <http://goo.gl/a1y1LE>
14:49 B21956 joined #gluster
14:50 fuzzy_id joined #gluster
14:51 fuzzy_id i'm running gluster 3.4, i had a replicate_count=2 running with two nodes, one brick per node
14:52 fuzzy_id i added a third node via gluster volume add-brick vol-name replica 3 new_brick
14:52 fuzzy_id now the new node and brick is listed in gluster volume info vol-name
14:52 fuzzy_id but there is no data written to the new brick
14:52 fuzzy_id i tried gluster volume rebalance vol-name start
14:53 gkleiman joined #gluster
14:53 gkleiman_ joined #gluster
14:53 fuzzy_id and got "failed: Volume vol-name is not a distribute volume or contains only 1 brick."
14:54 fuzzy_id and now i'm running out of ideas :/
14:58 andreask fuzzy_id: you tried a "heal" instead of a rebalance?
14:59 fuzzy_id yep
15:00 fuzzy_id 'Number of entries: 0' on the freshly added brick, is this normal?
15:00 andreask I'd expect to see all files
15:01 fuzzy_id yeah, me too
15:03 fuzzy_id heal-failed doesn't show any files neither
15:03 jag3773 joined #gluster
15:04 andreask hmm ... have you tried forcing the heal by adding the "full" keyword?
15:04 fuzzy_id yep
15:04 fuzzy_id same results
15:05 fuzzy_id just to be sure: increasing the replicate count the way i did it is supported, isn't it?
15:05 andreask yes, looks ok
15:05 andreask and touching a file replicates correctly?
15:05 fuzzy_id nope
15:06 fuzzy_id i don't see the file on the freshly added brick…
15:07 satheesh joined #gluster
15:08 fuzzy_id but peer status and volume info show the host and the added brick and tell me that everything is fine :/
15:10 andreask can you pastebin the volume info gluster volume status _volume_ ... and with detail
15:10 fuzzy_id also stopping all daemons and then restarting them didn't succeed
15:10 andreask strange
15:12 fuzzy_id oh, i just noticed that the new brick is not listed in volume status
15:17 hngkr_ joined #gluster
15:21 fuzzy_id http://pastebin.com/NCLDV6Pm
15:21 glusterbot Please use http://fpaste.org or http://paste.ubuntu.com/ . pb has too many ads. Say @paste in channel for info about paste utils.
15:22 fuzzy_id ok, so again on fpaste: http://ur1.ca/fxfzm
15:22 glusterbot Title: #48908 Fedora Project Pastebin (at ur1.ca)
15:23 fuzzy_id no sure, why the volume doesn't show up in volume status but shows up volume info
15:24 fuzzy_id -volume +brick
15:24 andreask fuzzy_id: info is a view of the configuration
15:24 andreask glusterd and glusterfs  processes are running on the third node?
15:25 fuzzy_id glusterfs does
15:26 fuzzy_id but glusterfsd doesn't seem to run
15:31 fuzzy_id hmm, just started glusterd --debug
15:31 fuzzy_id lots of
15:31 fuzzy_id [2013-10-23 15:30:17.962943] D [socket.c:486:__socket_rwv] 0-socket.management: EOF on socket
15:31 fuzzy_id [2013-10-23 15:30:17.962987] D [socket.c:2236:socket_event_handler] 0-transport: disconnecting now
15:31 fuzzy_id is this normal?
15:32 calum_ joined #gluster
15:35 fuzzy_id well, this is annoying: just rebooted, now glusterd is running, but glusterfs isn't
15:36 andreask name resolution working? no firewall that blocks?
15:40 klaxa|web joined #gluster
15:40 fuzzy_id just double checked that, seems ok
15:40 giannello joined #gluster
15:41 johnbot11 joined #gluster
15:43 andreask now the processes show up for the third node?
15:44 andreask brick filesystem mounted after the reboot?
15:44 fuzzy_id yeah, glusterd as well as glusterfs
15:44 ncjohnsto joined #gluster
15:44 fuzzy_id yep
15:45 andreask and if you mount the glusterfs locally on the third node you can access all files?
15:46 fuzzy_id yeah, i can access, but it hangs when i touch one
15:47 andreask errros in the gluster logs?
15:48 andreask can the first two hosts resolve the third correctly?
15:48 fuzzy_id E [afr-self-heal-entry.c:2296:afr_sh_post_nonblocking_entry_cbk] 0-lidl-lead-vol-replicate-0: Non Blocking entrylks failed for /.
15:49 fuzzy_id E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-lidl-lead-vol-replicate-0: background  entry self-heal failed on /
15:49 fuzzy_id this seems to be for the mount
15:49 fuzzy_id do you mean a reverse lookup?
15:50 fuzzy_id that works
15:50 hagarth joined #gluster
15:50 fuzzy_id A lookup is also correct on all the machines
15:51 joshcarter joined #gluster
15:51 fuzzy_id W [glusterd-op-sm.c:3170:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed
15:52 fuzzy_id this is from etc-glusterfs-glusterd.vol.log
15:52 fuzzy_id on the third node…
15:55 andreask and on the other nodes you can access the files?
15:59 fuzzy_id yep
15:59 fuzzy_id accessing and creating is no problem there
16:00 B21956 left #gluster
16:00 andreask strange, maybe a bug? you have 3.4.1?
16:00 kkeithley1 joined #gluster
16:00 fuzzy_id yes
16:01 fuzzy_id from semiosis ppa
16:02 andreask sorry, maybe someone else or the mailinglist has an idea
16:02 fuzzy_id ok
16:02 fuzzy_id thanks anyway
16:02 andreask yw
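
A condensed sketch of the checks walked through above when raising the replica count; vol-name is fuzzy_id's placeholder and the new brick path is made up:

    gluster volume add-brick vol-name replica 3 node3:/data/brick1
    gluster volume status vol-name       # the new brick must be listed and Online "Y" with a port
    gluster volume heal vol-name full    # push existing data onto the new brick
    gluster volume heal vol-name info    # watch pending entries drain to zero
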
16:08 chirino joined #gluster
16:12 kshlm joined #gluster
16:12 chouchins joined #gluster
16:26 Mo_ joined #gluster
16:32 ctria joined #gluster
16:36 mohankumar joined #gluster
16:36 johnbot11 joined #gluster
16:43 LoudNoises joined #gluster
16:44 _pol joined #gluster
16:47 daMaestro joined #gluster
16:50 kaushal_ joined #gluster
16:52 Raymii joined #gluster
16:56 zerick joined #gluster
17:00 zaitcev joined #gluster
17:07 hagarth joined #gluster
17:18 hagarth joined #gluster
17:21 calum_ joined #gluster
17:28 guix joined #gluster
17:29 guix Hello, any gluster developper around here?
17:29 ncjohnsto joined #gluster
17:30 Remco They do hang around here, but they're not always active
17:31 guix ah ok.
17:32 guix I have a gluster question maybe they or anyone could help. I have gluster installed (version 3.3.2) and once in a while I get the message: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this, as if it loses connectivity to the mount (fuse) filesystem
17:32 guix wondering if this is a known issue and if it was fixed in a newer version of gluster
17:33 guix I am using it with Apache and when this happens my apache process falls in a D state waiting for disk
17:38 cyberbootje joined #gluster
17:42 chirino joined #gluster
17:57 vimal joined #gluster
18:02 _pol joined #gluster
18:02 guix FYI, I am on Ubuntu 12.04.1 LTS
18:03 semiosis guix: you say you get that message... but where?  can you put the whole log up on pastie.org or something so we can see?
18:04 bcdonadio joined #gluster
18:05 mtanner_ joined #gluster
18:05 bcdonadio My bricks aren't listening for NFS connections, although I do have "nfs.disable: off". What am I missing?
18:05 bcdonadio s/bricks/nodes/g
18:05 glusterbot bcdonadio: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
18:05 guix_ joined #gluster
18:06 guix_ It is displayed in the syslogs. It is a Kernel stack trace.
18:10 mtanner joined #gluster
18:22 kPb_in_ joined #gluster
18:36 _pol joined #gluster
18:38 chirino joined #gluster
18:44 KORG joined #gluster
18:48 JoeJulian bcdonadio: "ps ax | grep nfs" to see if it's running. Make sure the ,,(ports) are open.
18:48 glusterbot bcdonadio: glusterd's management port is 24007/tcp and 24008/tcp if you use rdma. Bricks (glusterfsd) use 24009 & up for <3.4 and 49152 & up for 3.4. (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111 and 2049 since 3.4.
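
A few checks that go with the suggestion above; the volume name is a placeholder:

    ps ax | grep -i "gluster.*nfs"          # the gluster NFS server runs as a glusterfs process
    gluster volume status myvol nfs         # port and online state of the NFS server
    rpcinfo -p | grep -E "portmapper|nfs"   # rpcbind must be running and registered
    showmount -e localhost                  # the volume should be exported here
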
18:50 elyograg joined #gluster
18:52 elyograg ok, so I have this existing install with CentOS 6.  3.3.1-11 from kkeithley's repo.  I've been debating whether or not to go with 3.4 on the new servers and upgrading the rest later.  I've decided that the safe option is to just install kkeithley's repo on the new servers, which gets me 3.3.1-15.  Is the later build of the same version going to cause me any problems?
18:53 JoeJulian no
18:55 elyograg I do have to endure the pain of upgrading my network access servers (peers with no bricks) from CentOS 6.3 to 6.4, wherein pacemaker undergoes a change that's not compatible with servers running the older release.
18:55 elyograg shared IP for NFS/Samba.
18:57 elyograg A recent test has shown us that gluster via NFS performs about half as well as filesystems on fiberchannel->sata SANs shared via NFS.  We have no data on native mounts, can't do native mounts with solaris.
18:59 elyograg actually, it wasn't even really a test.  we had to gather and resize some images for a demo.  the images pulled from gluster went much slower.
19:09 JoeJulian So does that mean that the options are: run (expensive) solaris with very expensive emc/netapp/isilon or run linux?
19:10 kkeithley_ elyograg: the diff between the 3.3.1-11 and 3.3.1-15 are nit issues with packaging. The source itself is unchanged.
19:11 elyograg kkeithley_: thank you.
19:34 jbrooks joined #gluster
19:57 harish_ joined #gluster
20:02 ndk joined #gluster
20:10 jkarretero joined #gluster
20:11 Marian_spain joined #gluster
20:12 Marian_spain Hello everybody!
20:13 Marian_spain I get "Transport endpoint is not connected" when trying to move a folder.
20:14 Marian_spain Only when moving folders. I can mv files, create files and dirs, delete, ... without problems
20:15 Marian_spain When renaming a dir...
20:15 Marian_spain [2013-10-23 19:01:43.755690] I [dht-rename.c:275:dht_rename_dir] 0-storage-dht: one of the subvolumes down (storage-client-0)
20:15 Marian_spain [2013-10-23 19:01:43.755760] W [fuse-bridge.c:1620:fuse_rename_cbk] 0-glusterfs-fuse: 378000065: /home/jkarretero/.matlab -> /home/jkarretero/.matlab_ => -1 (Transport endpoint is not connected)
20:16 Marian_spain When restarting glusterd service...
20:16 Marian_spain [2013-10-23 19:37:55.872528] W [socket.c:514:__socket_rwv] 0-glusterfs: readv failed (No data available)
20:16 Marian_spain [2013-10-23 19:37:55.872587] W [socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (No data available), peer (192.168.0.1:24007)
20:16 Marian_spain any ideas?
20:17 bcdonadio Is 3.4 already recommended to be used in production, or should I stick with 3.3?
20:17 calum_ Would it be stupid to run gluster on xenserver with virtual harddrives? I guess it is best run on bare metal?
20:20 * Marian_spain slaps glusterbot around a bit with a large trout
20:20 bcdonadio complete stupidity, you would have no performance gain nor resilience (supposing you're using the same physical disk)
20:21 harish_ joined #gluster
20:24 semiosis calum_: i run gluster in ec2, which is xen
20:25 semiosis and my bricks are ebs volumes, which are "virtual harddrives"
20:26 semiosis Marian_spain: please ,,(pasteinfo)
20:26 glusterbot Marian_spain: Please paste the output of "gluster volume info" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
20:32 Marian_spain Volume Name: storage
20:32 Marian_spain Type: Distribute
20:32 Marian_spain Volume ID: a201cbc2-a727-47ef-a6ba-9fc4fba41b4a
20:32 Marian_spain Status: Started
20:32 Marian_spain Number of Bricks: 2
20:34 elyograg can't read instructions. ;)
20:34 semiosis srsly wtf
20:36 Marian_spain joined #gluster
20:37 Marian_spain sorry for flooding, will not happen again :)
20:40 Marian_spain Contents of storage-fuse.vol: http://fpaste.org/48998/
20:40 glusterbot Title: #48998 Fedora Project Pastebin (at fpaste.org)
20:40 semiosis Marian_spain: what version of glusterfs are you using?
20:40 Marian_spain glusterfs 3.4.0 built on Aug  6 2013 11:17:05
20:41 semiosis newer version is available, 3.4.1
20:42 Marian_spain I am going to try to update using yum
20:43 semiosis Marian_spain: check that you can telnet from client machine to the server julia on port 24007
20:43 semiosis 0-storage-dht: one of the subvolumes down (storage-client-0) --- this means the client cant connect to julia
20:43 pdrakewe_ joined #gluster
20:47 harish joined #gluster
20:47 johnbot11 joined #gluster
20:47 Marian_spain Telnet ok, you can see output here: http://fpaste.org/49002/
20:47 glusterbot Title: #49002 Fedora Project Pastebin (at fpaste.org)
20:48 johnbot11 joined #gluster
20:48 Marian_spain by the way, the client is julia itself
20:50 semiosis Marian_spain: check to make sure that... 1. the volume is started, and 2. there is a glusterfsd process (brick export daemon) running
20:50 semiosis could also use gluster volume status to check that, i think
20:55 Marian_spain volume is started and glusterfsd is running, you can see output of corresponding commands here: http://fpaste.org/49003/
20:55 glusterbot Title: #49003 Fedora Project Pastebin (at fpaste.org)
20:58 elyograg the fact that 127.0.0.1 resolved to julia sounds like /etc/hosts isn't right.  or at least it's not set up the way I would set it up.
21:00 Marian_spain yes. in fact, julia has multiple interfaces, you can see them here: http://fpaste.org/49004/13825620/
21:00 glusterbot Title: #49004 Fedora Project Pastebin (at fpaste.org)
21:01 Marian_spain They are for tcp and infiniband
21:02 Marian_spain 192.168.0.1 is for ethernet and 192.168.1.1 for infiniband
21:02 DV__ joined #gluster
21:03 elyograg IMHO, having multiple entries like that for a machine's own hostname is not a good idea.  and i'd remove it completely from the localhost lines.  I'd pick one to be the canonical hostname and give the other one a -something extension.
21:05 Marian_spain you are completely right. I would have not done it that way. I didn't configure that :D
21:06 elyograg here's the hosts file I've got on all my gluster servers, with the real domain name changed to example.com: http://fpaste.org/49006/56233313/
21:06 glusterbot Title: #49006 Fedora Project Pastebin (at fpaste.org)
21:07 Marian_spain it looks nice ;)
21:07 elyograg the servers themselves talk to each other via the 10.116 addresses.  the 10.108 addresses, known locally as -pub, are what DNS has for the actual hostnames.
19:08 elyograg so gluster's inter-server communication has a separate 1Gb/s network from where data transfer to clients takes place.
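
A sketch of the split-network /etc/hosts layout elyograg describes, with made-up names and addresses:

    127.0.0.1     localhost localhost.localdomain
    10.116.0.11   gluster1          # storage/peer network, used for peer probe and bricks
    10.108.0.11   gluster1-pub      # public/client network, the name DNS actually serves
    10.116.0.12   gluster2
    10.108.0.12   gluster2-pub
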
21:09 Marian_spain I have tried changing order of lines so that julia is resolved differt ways, but always have same error when moving :/
21:13 Marian_spain The strange thing is that other clients (different to julia) don't have the problem of moving folders inside the gluster
21:13 semiosis interesting
21:14 semiosis did you unmount & remount the client after changing the hosts resolution order?
21:16 Marian_spain I restarted glusterd service
21:16 Marian_spain do you mean that?
21:20 harish joined #gluster
21:22 Marian_spain I am going to unmount & remount gluster volume after changing the hosts resolution order
21:27 harish joined #gluster
21:29 kkeithley_ JoeJulian: thanks for commenting on that BZ
21:31 JoeJulian You bet. I'm going to take a look at that tonight. I think I have an idea what's happening and how to fix it.
21:31 kkeithley_ cool
21:38 kkeithley_ and thanks
21:38 JoeJulian Any time
21:41 hagarth joined #gluster
21:45 dbruhn joined #gluster
21:46 dbruhn version 3.3.1 what is the option to set the max fill level on the bricks?
21:47 Marian_spain It looks like julia doesn't revive after rebooting. So I will continue other day. Thanks semiosis! Kisses xxx
21:49 JoeJulian dbruhn: yes...
21:49 JoeJulian hrhr
21:51 JoeJulian cluster.min-free-disk
21:51 dbruhn hey Joe, odd question here.
21:51 dbruhn If I do that will a rebalance cause the system to adhere to the max fill level
21:52 dbruhn I am running into an issue with files being written to after the fact, over-filling bricks to 100% and then getting a "device doesn't have enough free space" error
21:52 JoeJulian I think it will if you use "force" at the end of the rebalance command.
21:52 dbruhn Ok, I will report back on that
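
The combination discussed here, as a sketch; the volume name and the 10% threshold are example values:

    gluster volume set myvol cluster.min-free-disk 10%    # stop placing new files on bricks under 10% free
    gluster volume rebalance myvol start force            # per the discussion above, force may be needed for the rebalance to respect it
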
21:53 dbruhn btw thanks again for all that help Friday night/.saturday morning
21:53 dbruhn I owe you beer or something equally as tasty
21:54 JoeJulian You're welcome. My wife was off doing girl things and my daughter was asleep, so I just sat enjoying my hobby. :)
21:55 dbruhn I ended up having another issue after I cleaned all that up: on one of the bricks, the link file in the .gluster directory that is linked to ../../.. somehow turned into a directory
21:55 dbruhn that was weird and causing issues too
21:56 JoeJulian dbruhn: I've seen that sometimes, too. I filed a bug but I think it was closed as "cannot repro".
21:57 dbruhn It caused a bunch of files to show up in directories twice with the same inode
21:58 JoeJulian Make root show up in the heal info list as well.
21:58 JoeJulian s/Make/Makes/
21:58 glusterbot What JoeJulian meant to say was: Makes root show up in the heal info list as well.
21:59 dbruhn Yeah I saw that, super weird
21:59 dbruhn glad I got it all fixed, the system started running so much better after that
22:12 ctria joined #gluster
22:28 klaxa joined #gluster
22:47 joshcarter joined #gluster
22:55 bstr joined #gluster
23:00 ncjohnsto joined #gluster
23:06 tryggvil joined #gluster
23:15 harish joined #gluster
23:38 F^nor joined #gluster
23:41 _pol_ joined #gluster
