IRC log for #gluster, 2013-11-26

All times shown according to UTC.

Time Nick Message
00:00 kobiyashi quick fix was to ls the apache fuse mount to the file and sometimes i get a structure needs cleaning and sometimes i don't
00:01 kobiyashi but once i do that, the changes and access to the content via the browser is OK
00:02 JoeJulian ~brick order | Gutleib
00:02 glusterbot Gutleib: Replicas are defined in the order bricks are listed in the volume create command. So gluster volume create myvol replica 2 server1:/data/brick1 server2:/data/brick1 server3:/data/brick1 server4:/data/brick1 will replicate between server1 and server2 and replicate between server3 and server4.
00:02 bigclouds_ joined #gluster
00:03 Gutleib seen that, double-checked /etc/hosts, that's different servers
00:05 Gutleib nevermind, I typo'd
00:05 Gutleib thx!
00:13 keytab joined #gluster
00:14 Kins joined #gluster
00:16 Gutleib and one more theoretical question: say, I have 2 servers on different sites. Can I use gluster for syncing lots of small files? I know that is not recommended now _in general_, but how can I cache file listings and make replication asynchronous? I'm ok with "eventually consistent" state, files are 95% read, 5% written 0% changed.
00:17 chirino joined #gluster
00:22 Technicool joined #gluster
00:24 johnbot11 joined #gluster
00:25 JoeJulian geo-synchronization
00:26 JoeJulian Though that's unidirectional.
00:27 JoeJulian If you can send your writes to a different directory, you could mount your remote volume for your few writes, then geo-sync them back for quick readability.
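A rough sketch of the unidirectional geo-replication JoeJulian is suggesting, pushing a master volume out to a second site (the volume and host names are made-up examples, and the exact slave syntax varies between 3.x releases):

    gluster volume geo-replication myvol remotehost::myvol-slave start
    gluster volume geo-replication myvol remotehost::myvol-slave status

The slave side only receives; for the few writes, the remote site would mount the master volume directly, as described above.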
00:34 chirino joined #gluster
00:39 dbruhn joined #gluster
00:42 Gutleib joined #gluster
00:43 Gutleib readonly is not an option, sadly
00:46 _polto_ joined #gluster
00:49 Gutleib oh, well/
00:49 Gutleib will test
00:50 Gutleib thx, bye!
00:54 bigclouds_ hi
00:54 glusterbot bigclouds_: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
01:00 msolo joined #gluster
01:01 msolo I'm seeing an error in the logs I can't figure out how to resolve:
01:01 msolo Invalid op operation doesn't have a task_id"
01:01 dbruhn What are the errors
01:02 msolo Failed to add task details to dict
01:02 dbruhn What are you trying to do while it's throwing those errors?
01:02 msolo volume status
01:02 dbruhn version?
01:03 msolo 3.4
01:03 JoeJulian I would guess there's a version mismatch somewhere.
01:04 msolo some system not on 3.4 you mean?
01:05 JoeJulian _add_task_to_dict expects there to be a task_id of GD_OP_REMOVE_BRICK, GD_OP_REBALANCE, or GD_OP_REPLACE_BRICK. The operation has none of those implying that some glusterd that's running is not the same version as the other(s).
01:07 msolo Hmm, not likely. glusterfs --version returns 3.4.1 on all machines
01:07 dbruhn what about glusterd
01:07 JoeJulian restart glusterd
01:08 JoeJulian Off to my train... ttfn.
01:08 msolo ok, the restart fixed it
01:09 msolo but that doesn't explain how it got into that state
01:09 msolo what more can I look ag?
01:09 msolo s/ag/at/
01:09 JoeJulian You upgraded but never restarted.
01:09 glusterbot msolo: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
01:10 dbruhn ugh, my damn laptop drive is corrupting all of my crap... SO LAME
01:10 JoeJulian at least that's my guess and the only thing I can think of that would show the right version but not produce the right structures.
01:11 msolo Let's hope it was an operator error. I only started with gluster on Saturday and I never installed 3.3 at any time. The machines are only 5 days old.
01:11 msolo Still, I'll rerun my whole bootstrap again
01:11 msolo and see if it wedges again
01:11 msolo i was testing pulling the power on a brick
01:11 dbruhn Never had 3.3 on those servers?
01:12 msolo Don't think so. I had the newer PPA in my Ubuntu kickstart file.
01:12 msolo Plus, most everything else has been working up to now
01:12 msolo I healed, failed a brick or two
01:13 msolo I was pretty careful, and my control script checks the installed version across the fleet
01:14 msolo thanks for the quick advice
01:19 bigclouds_ joined #gluster
01:20 _pol joined #gluster
01:22 bigclouds_ after i mount glusterfs, it reported 'invalid argument' when i mkdir
01:23 bigclouds_ it is good last time
01:31 gmcwhistler joined #gluster
01:33 jag3773 joined #gluster
01:36 davidbierce joined #gluster
01:45 gmcwhistler joined #gluster
01:51 bala joined #gluster
01:51 gmcwhistler joined #gluster
02:07 harish joined #gluster
02:10 bigclouds_ joined #gluster
02:17 PM1976 joined #gluster
02:17 PM1976 Hi All
02:20 PM1976 As people such as JoeJulian have been really kind to help me until now, I will share you what I applied to my gluster configuration, in case it can help others :)
02:21 PM1976 for the log rotation (gluster, Bricks and geo-replication), I created a file under /etc/logrotate.d/ with the following inside:
02:21 PM1976 /var/log/glusterfs/*.log /var/log/glusterfs/*/*.log /var/log/glusterfs/*/*/*.log {
02:21 PM1976 daily
02:21 PM1976 copytruncate
02:21 PM1976 rotate 7
02:21 PM1976 compress
02:21 PM1976 postrotate
02:21 PM1976 /usr/bin/killall -HUP glusterfsd
02:21 PM1976 endscript
02:21 PM1976 }
02:21 PM1976 this allows all my logs to rotate properly and with just a single setting.
02:21 PM1976 It runs properly like this for a few days already.
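Assembled into one piece, the file PM1976 pasted line by line above (e.g. saved as /etc/logrotate.d/glusterfs) looks like this:

    /var/log/glusterfs/*.log /var/log/glusterfs/*/*.log /var/log/glusterfs/*/*/*.log {
        daily
        copytruncate
        rotate 7
        compress
        postrotate
            /usr/bin/killall -HUP glusterfsd
        endscript
    }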
02:24 PM1976 For the issue of excluding a specific extension from geo-replication (in my case, WinSCP's *.filepart extension), I modified the following line in /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py:
02:24 PM1976 op.add_option('--rsync-options',       metavar='OPTS',  default='--exclude *.filepart --sparse')
02:25 PM1976 doing this, it will start the geo-replication of a file (some are 300 GB) only once it has been fully transferred into the source gluster.
02:29 bulde joined #gluster
02:37 PM1976 Now, I am trying to achieve another goal in the geo-replication: using UDR instead of rsync alone to boost the geo-replication, as I was able to get average performance 4 times faster (about 250 Mbps on my 1 Gbps link with 250ms latency)
02:37 PM1976 if anyone has idea to use UDR, feel free to let me know
02:44 bigclouds_ joined #gluster
02:52 glusted joined #gluster
02:53 bharata-rao joined #gluster
02:53 glusted Moiderators!!!
02:53 glusted moderators, show up quickly thanks-
02:55 glusted joined #gluster
02:57 mohankumar joined #gluster
03:00 masterzen joined #gluster
03:02 kshlm joined #gluster
03:02 glusted losers.
03:07 jag3773 joined #gluster
03:10 raghug joined #gluster
03:13 pdrakeweb joined #gluster
03:20 purpleidea PM1976: (,,paste)
03:21 purpleidea @paste ~ PM1976
03:21 purpleidea ~paste | PM1976
03:21 glusterbot PM1976: For RPM based distros you can yum install fpaste, for debian and ubuntu it's pastebinit. Then you can easily pipe command output to [f] paste [binit] and it'll give you a URL.
03:21 purpleidea PM1976: don't paste in channel :P
03:22 micu joined #gluster
03:36 shubhendu joined #gluster
03:37 ppai joined #gluster
03:37 bulde joined #gluster
03:39 Alex Are many people using Unicode on Gluster? I'm still seeing more issues than just the ones described in #1024181 -- with a 3.4.55 kernel, I'm actually unable to even see the files.
03:41 davinder joined #gluster
03:42 DV__ joined #gluster
03:42 vpshastry joined #gluster
03:43 Alex ie, 'find /data*' shows up the files on the underlying bricks, but find across the volume shows no files at all (e.g. https://gist.github.com/a204532fe1e4fdb7c2c1)
03:43 glusterbot Title: gist:a204532fe1e4fdb7c2c1 (at gist.github.com)
03:45 RameshN joined #gluster
03:45 itisravi joined #gluster
03:46 bigclouds_ who could answer my question? thanks
04:02 PM1976 purpleidea: thanks for the info
04:02 PM1976 :)
04:28 vshankar joined #gluster
04:29 shylesh joined #gluster
04:30 ngoswami joined #gluster
04:33 ricky-ti1 joined #gluster
04:33 MiteshShah joined #gluster
04:34 satheesh joined #gluster
04:39 chirino joined #gluster
04:50 shruti joined #gluster
04:54 dusmant joined #gluster
04:54 ababu joined #gluster
05:00 lalatenduM joined #gluster
05:02 bala joined #gluster
05:07 _dist joined #gluster
05:09 TDJACR joined #gluster
05:11 bigclouds joined #gluster
05:15 saurabh joined #gluster
05:15 sgowda joined #gluster
05:16 bigclouds joined #gluster
05:18 kanagaraj joined #gluster
05:22 meghanam joined #gluster
05:22 meghanam_ joined #gluster
05:25 DV joined #gluster
05:37 CheRi joined #gluster
05:40 psharma joined #gluster
05:44 bulde joined #gluster
05:48 raghu joined #gluster
06:06 rastar joined #gluster
06:10 _polto_ joined #gluster
06:11 nshaikh joined #gluster
06:12 shruti joined #gluster
06:13 ngoswami joined #gluster
06:16 ndarshan joined #gluster
06:20 chirino joined #gluster
06:26 geewiz joined #gluster
06:27 glusterbot New news from newglusterbugs: [Bug 990028] enable gfid to path conversion <http://goo.gl/1HwiQc> || [Bug 969461] RFE: Quota fixes <http://goo.gl/XFSM4>
06:38 krypto joined #gluster
06:41 anands joined #gluster
06:44 T0aD hi
06:44 glusterbot T0aD: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
06:44 T0aD wow!
06:44 T0aD is gluster massive quotas ready for testing ? :)
06:46 davinder joined #gluster
06:50 RameshN joined #gluster
06:52 hagarth joined #gluster
07:03 chirino joined #gluster
07:04 kevein joined #gluster
07:06 sprachgenerator joined #gluster
07:19 ctria joined #gluster
07:22 _dist left #gluster
07:27 jtux joined #gluster
07:34 hagarth joined #gluster
07:36 shri_ joined #gluster
07:43 rjoseph joined #gluster
07:46 keytab joined #gluster
07:48 ekuric joined #gluster
07:54 krypto joined #gluster
07:55 hagarth joined #gluster
07:59 eseyman joined #gluster
08:02 geewiz joined #gluster
08:11 andreask joined #gluster
08:14 franc joined #gluster
08:14 franc joined #gluster
08:19 khushildep joined #gluster
08:28 glusterbot New news from newglusterbugs: [Bug 962226] 'prove' tests failures <http://goo.gl/J2qCz>
08:32 vimal joined #gluster
08:37 satheesh1 joined #gluster
08:37 franc joined #gluster
08:40 glusterbot New news from resolvedglusterbugs: [Bug 948729] gluster volume create command creates brick directory in / of storage node if the specified directory does not exist <http://goo.gl/Og5Sf>
08:43 calum_ joined #gluster
08:44 vpshastry joined #gluster
08:51 mgebbe_ joined #gluster
09:03 harish joined #gluster
09:08 getup- joined #gluster
09:22 MiteshShah joined #gluster
09:27 gdubreui joined #gluster
09:32 bp_ Hi. I'm running Debian Wheezy and want to install QEMU 1.3 with glusterfs support. The build dependency for glusterfs support is having a successful "#include <glusterfs/api/glfs.h>"
09:33 bp_ ...so I cloned the gluster repos, checked out v3.2.7 and... I can't find a way to build the api directory
09:33 bp_ it seems that ./configure never turns the Makefile.in file in there into a Makefile proper.
09:33 Remco bp_: If at all possible, go with gluster 3.4.x
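A quick way to confirm that the libgfapi header and library QEMU's configure probes for are actually usable (a sketch: the temp file, volume name and success message are placeholders, not anything QEMU itself uses):

    cat > /tmp/glfs_check.c <<'EOF'
    #include <glusterfs/api/glfs.h>
    int main(void)
    {
        glfs_t *fs = glfs_new("testvol");   /* any volume name works for a compile/link check */
        if (fs)
            glfs_fini(fs);
        return 0;
    }
    EOF
    gcc /tmp/glfs_check.c -o /tmp/glfs_check -lgfapi && echo "libgfapi looks usable"

If that compiles and links, QEMU's --enable-glusterfs configure check should pass as well.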
09:34 ngoswami joined #gluster
09:36 vshankar joined #gluster
09:52 rastar joined #gluster
09:52 hagarth joined #gluster
09:54 _polto_ joined #gluster
09:58 glusterbot New news from newglusterbugs: [Bug 917272] program to just set trusted.glusterfs.volume-id, for running glusterfs standalone <http://goo.gl/tuuHZ>
09:59 ngoswami joined #gluster
10:02 bp_ Remco: that helped, thank you.
10:05 vpshastry joined #gluster
10:09 shri joined #gluster
10:10 shri hagarth: Hi...
10:11 hagarth shri: hello
10:11 coxy23 joined #gluster
10:12 franc joined #gluster
10:13 shri hagarth: did you get any chance for devstack ?
10:17 hagarth shri: running into problems with rabbit in devstack..
10:18 franc joined #gluster
10:28 dusmant joined #gluster
10:35 shri Ohhhhhhhhhh
10:36 shri hagarth: disable RabbitMQ & MySQL
10:36 shri hagarth: use postgresql & qpid
10:37 shri hagarth:  add below things in localrc & also do similar changes in stack.sh
10:37 shri export disable_service rabbit
10:37 shri export enable_service qpid
10:37 shri export disable_service mysql
10:37 shri export enable_service postgresql
10:37 shri hagarth: by default devstack takes RabbitMQ and it hangs openstack commands
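Consolidated, shri's localrc changes would look roughly like this (a sketch: devstack's enable_service/disable_service are normally written as plain function calls in localrc rather than exports):

    # devstack localrc
    disable_service rabbit
    disable_service mysql
    enable_service qpid
    enable_service postgresql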
10:37 hagarth shri: mysql is working fine for me
10:37 hagarth shri: will try that out, thanks
10:37 shri hagarth: ohh on my setup mysql is not starting so..
10:38 shri hagarth: also...I have started debugging nova boot command control flow
10:38 shri hagarth: but found that it won't invoke that virt/libvirt/volume.py which has gluster related changes
10:39 shri hagarth: so investigating the things...
10:39 hagarth shri: interesting .. can you share that control flow log with me?
10:39 shri hagarth: yeah I will check I have save..
10:40 shri hagarth: I did that through pdb.. and for a few functions like get_image_id, get_volume etc. I have not debugged them in detail as they are just returning IDs, so..
10:43 shri hagarth: How can I share my control flow log? Where can I upload it?
10:45 vpshastry joined #gluster
10:47 RameshN joined #gluster
10:48 lalatenduM joined #gluster
10:49 hagarth shri: you can mail me too .. how big is the file?
10:50 shri hagarth: it's small..  40K
10:50 hagarth shri: i think email will work fine
10:50 shri hagarth: I checked function calls & python script calls from PDB debugger
10:50 shri hagarth: sure.. Thanks !
10:58 andreask joined #gluster
10:59 calum_ joined #gluster
11:01 franc joined #gluster
11:10 ppai joined #gluster
11:13 davinder joined #gluster
11:16 khushildep joined #gluster
11:28 glusterbot New news from newglusterbugs: [Bug 1032859] readlink returns EINVAL <http://goo.gl/ojGr7c>
11:28 muhh joined #gluster
11:30 spandit joined #gluster
11:30 vpshastry joined #gluster
11:46 nullck joined #gluster
11:46 andreask1 joined #gluster
11:47 kkeithley1 joined #gluster
11:48 StarBeast joined #gluster
11:58 ndarshan joined #gluster
11:58 glusterbot New news from newglusterbugs: [Bug 1034489] SELinux is preventing /usr/sbin/glusterfsd from 'create' accesses on the fifo_file fifo. <http://goo.gl/r8vdIx>
11:58 CheRi joined #gluster
12:00 lpabon joined #gluster
12:05 shyam joined #gluster
12:07 StarBeast joined #gluster
12:09 kkeihthle_ ,,(bug)
12:09 glusterbot I do not know about 'bug', but I do know about these similar topics: 'fileabug'
12:09 kkeihthle_ ,,(fileabug)
12:09 glusterbot Please file a bug at http://goo.gl/UUuCq
12:09 itisravi joined #gluster
12:21 rcheleguini joined #gluster
12:23 andreask joined #gluster
12:25 ppai joined #gluster
12:26 anands joined #gluster
12:27 jskinner joined #gluster
12:29 social kkeithley_: ping https://bugzilla.redhat.com/show_bug.cgi?id=1033576
12:29 glusterbot <http://goo.gl/tW3gtb> (at bugzilla.redhat.com)
12:29 glusterbot Bug 1033576: unspecified, high, ---, sgowda, NEW , rm: cannot remove  Directory not empty on path that should be clean already
12:30 social kkeithley_: I'd really love to understand how 1f7dadccd45863ebea8f60339f297ac551e89899 breaks it and how b13c483dca20e4015b958f8959328e665a357f60 fixes it :/
12:35 davinder2 joined #gluster
12:47 rastar joined #gluster
12:48 shri joined #gluster
12:56 getup- joined #gluster
12:57 getup- hi, when i'm writing files over the fuse mount point and one of the nodes fails, there is about a minute timeout before it starts working again (that one write takes a minute). Which timeout am I looking for here?
13:03 Remco network.ping-timeout
13:03 Remco (IIRC)
13:04 haritsu joined #gluster
13:12 raghug joined #gluster
13:18 ndarshan joined #gluster
13:20 getup- Remco: that seems to be the one, thanks
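For reference, that timeout is a per-volume option; a minimal sketch of lowering it, assuming a volume named myvol (the default is 42 seconds, and very low values trade faster failover for more spurious disconnect/reconnect cycles):

    gluster volume set myvol network.ping-timeout 10
    gluster volume info myvol    # the changed value shows up under "Options Reconfigured"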
13:42 kkeithley_ social: good question. Just based on the desriptions of those commits I wouldn't think they'd be at all related
13:44 social kkeithley_: well note that the georeplication is required part for reproducing the issues
13:47 davidbierce joined #gluster
13:47 shri hagarth: Hi.. sent the logs file..
13:48 shri hagarth: plz check the mail
13:50 hagarth shri: thanks for that! will do and get back to you
13:50 dusmant joined #gluster
13:51 shri hagarth: yup.. Thanks !!
13:56 bennyturns joined #gluster
13:58 sprachgenerator joined #gluster
14:02 calum_ joined #gluster
14:14 getup- lets assume a node fails in a 2 node replicate set, whats the best way to monitor from the running node that the other one failed? i tried a volume status but that takes roughly 2 minutes, any quicker routes?
14:22 pkoro joined #gluster
14:24 dbruhn joined #gluster
14:26 hurl joined #gluster
14:31 hurl hi all. I'm having some errors in logs but I'm unable to guess what they mean. If someone has some spare time :)
14:31 dbruhn Put the error into fpaste and lets take a look
14:32 hurl hi dbruhn. ok wait a second
14:32 hurl here  it is http://ur1.ca/g3nm1
14:32 glusterbot Title: #56866 Fedora Project Pastebin (at ur1.ca)
14:32 bala joined #gluster
14:33 chirino joined #gluster
14:33 dbruhn so what are you experiencing when you see these errors?
14:34 geewiz joined #gluster
14:34 hurl nothing :) I've setup a replicated volume, with 2 bricks. I've removed 1 brick for migration purpose. Don't know if this error raised after that or not
14:34 an joined #gluster
14:34 dbruhn Are all of your peers connected and all of your bricks available?
14:34 hurl vz containers are stored on the mounted glusterfs volume (fuse). However I've experienced a "bus error" in a VZ; don't know either if there's a relation
14:35 _polto_ joined #gluster
14:35 hurl the second peer is disconnected; the bricks were removed too
14:36 hurl the second brick was removed, sorry
14:36 dbruhn But your volume is still setup up as a replica 2?
14:36 davinder joined #gluster
14:36 hurl no, distribute
14:37 hurl I can't remove a brick without doing a "replica 1"
14:37 hurl and I guess it change the volume type from replica to distribute
14:37 dbruhn do me a favor and run a gluster volume info and put it on fpaste
14:37 hurl basically, my purpose was to test if I can create a replica 2 volume, remove 1 brick, and add it later
14:38 dbruhn you can change the replica count, I know that much, just not sure about how to actually execute it, and haven't tested it before
14:38 hurl http://www.fpaste.org/56870/47670813/
14:38 glusterbot Title: #56870 Fedora Project Pastebin (at www.fpaste.org)
14:39 dbruhn Looks like it indeed has gone to a DHT volume
14:39 hurl yes it seems
14:39 dbruhn what about the output for gluster peer status?
14:40 hurl http://www.fpaste.org/56872/38547680/
14:40 glusterbot Title: #56872 Fedora Project Pastebin (at www.fpaste.org)
14:42 dbruhn Has your client been connected since before you removed the second server/brick?
14:42 hurl yes
14:43 dbruhn try disconnecting and reconnecting the client
14:43 dbruhn and see if the error goes away
14:43 hurl i've just shutdown the service on node2
14:44 hurl is there a "chance" that it will touch data on node3 ?
14:44 hurl (the running one)
14:44 dbruhn When the gluster client connects to a single server, that server returns all of the servers and bricks it can connect to, and the client connects to all of the bricks. I'm not sure this would be updated after connection
14:44 hybrid5121 joined #gluster
14:45 dbruhn what do you mean touch? by un-mounting and remounting?
14:45 haritsu joined #gluster
14:45 dbruhn Also your one peer is still showing up disconnected in the peer status
14:46 hurl not mounting, but kind of sync or I don't know. If I understand correctly, the volume is on DHT now, with just 1 peer, right ?
14:46 dbruhn if you really want it removed you will need to run "gluster peer detach server"
14:46 gmcwhistler joined #gluster
14:47 dbruhn You are correct, the second peer is still part of your storage pool though
14:47 dbruhn did you change the replication setting before or after you downed the second server?
14:47 hurl ok. the fact is I've disconnected 1 node until I reorganize its partition table.
14:48 hurl yes, I've run a "remove-brick replica 1 node2:…"
14:48 dbruhn ok, if you use that peer remove command, the system will stop trying to connect to it as well
14:48 jskinner_ joined #gluster
14:49 dbruhn even though it's not hosting a brick, it is still part of your trusted storage pool
14:49 hurl ok, so the errors in logs are basically just saying that peer2 can't be reached?
14:49 shyam joined #gluster
14:49 dbruhn The first one looks like it is saying the client can reach it, and the second one looks like it is saying the management network can't reach it.
14:50 dbruhn at least from a quick review
14:50 hurl so with my current setup it's "normal" ?
14:50 hurl the important question is "are these errors a problem for my data?"
14:50 dbruhn I guess that much I don't know
14:51 dbruhn to be honest
14:51 dbruhn If you are concerned about it, a backup might be in order before you proceed to make any more changes
14:51 failshell joined #gluster
14:51 hurl of course.
14:51 bala joined #gluster
14:51 hurl I've tested it before and it seems ok, but I always like a confirmation when playing with data :)
14:52 dbruhn Is this a production system?
14:52 hurl yes
14:53 dbruhn This blog will give you a little insight into a better way to go about what you are trying to do.
14:53 dbruhn http://joejulian.name/blog/replacing-a-glusterfs-server-best-practice/
14:53 glusterbot <http://goo.gl/pwTHN> (at joejulian.name)
14:54 haritsu_ joined #gluster
14:54 hurl I've done basic test like remove-brick, creating files, add-brick replica 2, test if files sync etc...
14:56 hurl thanks a lot dbruhn
14:56 dbruhn No problem, hope I was of some help.
14:56 hurl I've also read an article on this blog about adding replica brick and using replace-brick; but to be honest I've not really understood all the magic
14:57 hurl Yes you were :)
14:57 _BryanHm_ joined #gluster
14:57 dbruhn That JoeJulian blog is a great resource, keep it bookmarked.
14:58 calum_ joined #gluster
14:59 y4m4_ joined #gluster
15:04 mattapp__ joined #gluster
15:05 satheesh1 joined #gluster
15:07 johnmark dbruhn: +1
15:10 zerick joined #gluster
15:13 klaas joined #gluster
15:20 zerick joined #gluster
15:21 vpshastry joined #gluster
15:22 dneary_ joined #gluster
15:26 wushudoin| joined #gluster
15:30 ira joined #gluster
15:32 vshankar joined #gluster
15:41 calum_ joined #gluster
15:47 haritsu joined #gluster
15:50 jag3773 joined #gluster
16:00 social a2: is it possible to create backport of https://bugzilla.redhat.com/show_bug.cgi?id=847839 at least up to b13c483dca20e4015b958f8959328e665a357f60 to 3.4.1? It's because of https://bugzilla.redhat.com/show_bug.cgi?id=1033576 (I created some backport patches but they are in state of "somehow works")
16:00 glusterbot <http://goo.gl/l4Gw2> (at bugzilla.redhat.com)
16:00 glusterbot Bug 847839: unspecified, unspecified, ---, csaba, ASSIGNED , [FEAT] Distributed geo-replication
16:00 glusterbot Bug 1033576: unspecified, high, ---, sgowda, NEW , rm: cannot remove  Directory not empty on path that should be clean already
16:01 japuzzo joined #gluster
16:03 dbruhn Is there anyway to change the transport type on a volume?
16:04 kobiyashi can you rename a volume after its been created?
16:07 KORG joined #gluster
16:07 japuzzo joined #gluster
16:08 dewey joined #gluster
16:10 _polto_ joined #gluster
16:10 _polto_ joined #gluster
16:16 hagarth joined #gluster
16:21 kobiyashi i'm trying to rsync data to my new gluster distr/rep 4 nodes and this is what i see in the logs
16:21 kobiyashi W [marker-quota.c:2039:mq_inspect_directory_xattr] 0-devstatic-marker: cannot add a new contribution node
16:21 kobiyashi should i be concerned?
16:26 semiosis JoeJulian: ping
16:30 vpshastry left #gluster
16:31 raghug joined #gluster
16:35 Technicool joined #gluster
16:35 Mo__ joined #gluster
16:39 Technicool joined #gluster
16:42 rotbeard joined #gluster
16:46 kaptk2 joined #gluster
16:50 ira joined #gluster
16:52 andreask joined #gluster
16:55 hurl left #gluster
16:58 johnbot11 joined #gluster
17:02 _polto_ joined #gluster
17:02 _polto_ joined #gluster
17:03 an joined #gluster
17:03 plarsen joined #gluster
17:09 elyograg One of our devs suspects that we might have run into bug 820518 ... but the bug is extremely unclear about how to reproduce.  It says they brought the brick down and then brought it back up ... but doesn't say HOW they did this.
17:09 glusterbot Bug http://goo.gl/LUq7S medium, medium, ---, pkarampu, CLOSED DUPLICATE, Issues with rebalance and self heal going simultanously
17:13 aliguori joined #gluster
17:13 sroy__ joined #gluster
17:22 [o__o] left #gluster
17:25 [o__o] joined #gluster
17:26 jbd1 joined #gluster
17:35 dusmant joined #gluster
17:37 sprachgenerator joined #gluster
17:40 dusmant joined #gluster
17:43 [o__o] left #gluster
17:43 bstr Hey guys, im running a two node gluster volume (replica) on RHEL 6.4 (gluster 3.4.1). But im noticing when i reboot one of the boxes the gluster fuse mount becomes unavailable while the one node is down. Even more odd, after this happened, it looks like the brick on the node that remained online somehow went offline, and does not indicate a PID via gluster volume status
17:44 bstr could this be a quorum setting? I haven't changed from default
17:44 samppah bstr: does the volume stay offline forever?
17:45 samppah it should be available after network.ping-timeout, which is 42 seconds by default
17:45 [o__o] joined #gluster
17:46 EWDurbin joined #gluster
17:46 EWDurbin anything i can do to speed up failover when mounting with the gluster native client?
17:47 bstr been offline for 12 minutes
17:47 vimal joined #gluster
17:48 [o__o] left #gluster
17:50 [o__o] joined #gluster
17:50 semiosis bstr: need to check your client mount log file.  maybe your client was only connected to one server all along?
17:52 haritsu_ joined #gluster
17:52 semiosis EWDurbin: you can configure the volume's ping-timeout
17:52 semiosis but that's not a mount-time option
17:53 bstr thats what i initially thought as well, but im mounting via 'localhost:replicated'. looking at the logs im seeing a bunch of readv failed (no data available) now http://hastebin.com/godemocela.axapta
17:53 glusterbot Title: hastebin (at hastebin.com)
17:54 semiosis bstr: indeed.  client never connected to one of the bricks.  check the log file on the server for that brick
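The logs being discussed live in roughly these places on a stock 3.4 install (paths are packaging defaults; the client log is named after the mount point with slashes turned into dashes):

    /var/log/glusterfs/<mount-point>.log          # fuse client log, e.g. data-replicated.log for /data/replicated
    /var/log/glusterfs/bricks/<brick-path>.log    # one log per brick, on each server
    /var/log/glusterfs/glustershd.log             # self-heal daemon log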
17:55 EWDurbin semiosis: that's a per volume config? awesome, thanks for the tap in the right direction
17:55 haritsu joined #gluster
18:01 andreask joined #gluster
18:01 bstr semiosis : yea something is definitely not right here, im seeing the symlink errors flooding the logs and it looks like it never recovered http://hastebin.com/danusayuki.coffee
18:01 glusterbot Title: hastebin (at hastebin.com)
18:01 bstr i was seeing some weird metadata issues with this box while setting it up last night, but a reboot seemed to fix that (or at least i thought it did)
18:02 bstr im running cobbler on these two nodes running a bind mount from /etc/cobbler/ -> /data/replicated/cobbler-etc (fuse-mount), is this supported with gluster?
18:04 Gilbs1 joined #gluster
18:12 pravka joined #gluster
18:16 _pol joined #gluster
18:42 glusterbot New news from resolvedglusterbugs: [Bug 953694] Requirements of Samba VFS plugin for glusterfs <http://goo.gl/v7g29>
18:48 sprachgenerator_ joined #gluster
18:50 Xunil left #gluster
19:02 mattapp__ joined #gluster
19:09 _dist joined #gluster
19:10 davidbierce joined #gluster
19:10 cfeller joined #gluster
19:10 _dist I was hoping someone had the time to talk about two areas, 1) self healing on VMs (how do I know which is healthy) and 2) analyzing the results of volume profile
19:12 davidbie_ joined #gluster
19:13 _dist the first question is, if when VMs are running a replica, you will always see (as a fundamental part of the way it works, I've been told) all VMs in the gluster volume heal x info command. How can I know which ones aren't _actually_ healing
19:13 _dist running on*
19:16 mattapp__ joined #gluster
19:22 _dist so last reiteration (then I'll wait) :) - watch -n1 gluster volume heal x info - will alternatively show all my vm disks for random periods of time, if they are on. How do I know which ones are healthy?
19:24 y4m4_ joined #gluster
19:32 glusterbot New news from newglusterbugs: [Bug 1009134] sequential read performance not optimized for libgfapi <http://goo.gl/K8j2w2>
19:42 chirino joined #gluster
19:43 kobiyashi if i upgraded to glusterfs3.4.1-3 should that not be reflected in the logs?  I [client-handshake.c:1658:select_server_supported_programs] 0-devstatic-client-3: Using Program GlusterFS 3.3, Num (1298437), Version (330)
19:43 kobiyashi i'm not quite understanding what the glustershd.log is indicating
19:45 kobiyashi @_dist I think to know which is healthy is from 'gluster volume heal <vol> info' = 0 files for all nodes
19:45 kobiyashi i just saw this from a news letter https://access.redhat.com/site/sites/default/files/attachments/rhstorage_split-brain_20131120_0.pdf
19:45 glusterbot <http://goo.gl/gFFY6o> (at access.redhat.com)
20:01 gdubreui joined #gluster
20:02 glusterbot New news from newglusterbugs: [Bug 955548] adding host uuids to volume status command xml output <http://goo.gl/rZS9c>
20:02 _dist @kobiyashi while that's true, I've been told that due to the nature of many writes to massive files gluster will show items in the heal <vol> info (until they are synced) despite the fact that I'm running a sync replica
20:03 _dist nothing is shown in the log, nothing is shown in the history of gluster volume heal <vol> info healed
20:03 _dist this troubles me only because I'm confused by it, so I'm looking for clarification
20:11 jag3773 joined #gluster
20:16 _dist btw I am running gluster 3.4.1 @ oct 15th
20:21 mattapp__ joined #gluster
20:23 mattapp__ joined #gluster
20:23 elyograg does anyone know how to bring down a brick, as described in bug 820518 ?
20:24 glusterbot Bug http://goo.gl/LUq7S medium, medium, ---, pkarampu, CLOSED DUPLICATE, Issues with rebalance and self heal going simultanously
20:29 EWDurbin left #gluster
20:32 cogsu joined #gluster
20:42 mattapp__ joined #gluster
20:54 jag3773 joined #gluster
21:08 lpabon joined #gluster
21:08 JoeJulian ~extended attributes | _dist
21:08 glusterbot _dist: (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://goo.gl/Bf9Er
21:08 JoeJulian _dist: read #2 to see how that works.
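A concrete version of glusterbot's getfattr command, run against a file directly on a brick (the path and volume name are examples); each trusted.afr.<volume>-client-N attribute packs three 32-bit counters (data, metadata and entry changelogs), and all-zero values on both replicas mean nothing is pending for that file:

    getfattr -m . -d -e hex /data/brick1/images/vm1.img
    # e.g. trusted.afr.myvol-client-0=0x000000000000000000000000
    #      trusted.afr.myvol-client-1=0x000000000000000000000000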
21:23 dbruhn Is there something wrong with the 3.3.2 repo?
21:23 dbruhn http://download.gluster.org/pub/gluster/glusterfs/3.3/3.3.2/RHEL/
21:23 glusterbot <http://goo.gl/pAbb9G> (at download.gluster.org)
21:24 JoeJulian In what way?
21:25 dbruhn I have added the 3.3.2 repo, and for some reason can't install 3.3.2 at all using yum. I can manually download the packages and install them via RPM, but when I try and update my server it crabs about only 3.4.0 and 3.2.7 being available
21:25 dbruhn was trying to use versionlock to get around it, but no avail
21:25 dbruhn the repo is installed and enabled according to yum
21:27 JoeJulian Which distro is this?
21:27 dbruhn redhat 6.4
21:30 dbruhn [root@ENTSNV06002EP etc]# cat redhat-release
21:30 dbruhn Red Hat Enterprise Linux Workstation release 6.4 (Santiago)
21:31 JoeJulian So it's getting 3.2.7 from epel... not sure where it's coming up with 3.4.0...
21:31 _pol_ joined #gluster
21:33 dbruhn http://www.fpaste.org/56999/85501579/
21:33 glusterbot Title: #56999 Fedora Project Pastebin (at www.fpaste.org)
21:33 samppah probably from red hat's own repositories
21:33 JoeJulian Oh... right...
21:33 dbruhn lol
21:33 dbruhn http://www.fpaste.org/57000/38550163/
21:33 glusterbot Title: #57000 Fedora Project Pastebin (at www.fpaste.org)
21:34 dbruhn do I need to blacklist it from some of the repos?
21:35 JoeJulian probably.
21:35 samppah dbruhn: uhm, it says that 3.3.2 is installed. what are you trying to do? yum update?
21:36 dbruhn yeah, I downloaded the packages manually from the repo and installed them in a half assed attempt to get it setup
21:36 dbruhn but now yum is crabbing about the version
21:36 samppah okay
21:36 dbruhn and version lock won't lock them, I am assuming because they were installed manually
21:37 dbruhn or because the repos aren't returning the correct info for the version
21:38 samppah red hat doesn't provide glusterfs-server package so that may be causing problems
21:39 samppah glusterfs-server-3.3.2 requires glusterfs-3.3.2 which is updated by glusterfs-3.4.0rhs and there is no update available for glusterfs-server
21:39 samppah no clue why versionlock doesn't work
21:40 JoeJulian My guess would be to exclude=glusterfs* from rhel-x86_64-workstation-6
21:40 dbruhn kk
21:40 dbruhn I'll give this a whirl
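Sketched out, the suggestion is a per-repo exclude: yum honours an exclude= line inside a repo stanza, so adding one to the RHEL channel (repo id as JoeJulian names it) hides its glusterfs builds while leaving the gluster.org repo untouched. An exclude placed in the [main] section of /etc/yum.conf instead applies to every repo, not just one:

    [rhel-x86_64-workstation-6]
    ...
    exclude=glusterfs*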
21:40 _dist JoeJulian: I have read the how it works, I understand how to run the profiling, but I am wondering what some expected metrics should be, if us = microseconds I'm seeing peak write latency of as much as 1-2 seconds and average around 23k us is that normal over 10gbe ?
21:42 JoeJulian _dist: I'm not sure we're talking about the same thing. I'm referring to files showing up in "heal info".
21:42 _dist oh, sorry I had two questions way up I assumed when you mean #2 you meant my profiling question
21:43 JoeJulian Sorry, I didn't get scrolled back that far. I'm having a lot of toddler interruptions today... :/
21:44 _dist so just to be clear the heal info shows files that are being written to rapidly because of latency? and getfattr is the only way to really know if the file is up to date on the brick you are looking at ?
21:45 JoeJulian Using getfattr you'll still see pending values occasionally. It marks it pending, does the operation, then clears the attribute. There's always a moment when you'll be able to see that.
21:46 JoeJulian Since there's no timestamp associated with those, there's no way of knowing if they're recent enough to be considered "safe"... Though now that I've said that, you could probably use mtime...
21:46 _dist I realize my testing is _very_ aggressive on the system, using a replica 2 volume I'm frequently taking one volume down, tweaking the fs and bringing it back up
21:47 _dist so if I take system A down for changes, bring it back up and migrate my VMs over to it from B (to make changes to B), it's hard to know when A isn't still getting some of its data from B (if you follow what I'm saying)
21:48 JoeJulian I do.
21:48 nonsenso i just derp'd pretty nicely.
21:48 nonsenso i was extending a brick and accidentally added a local filesystem which is on the root filesystem as a brick.
21:48 JoeJulian What I usually do is watch 'gluster volume heal $vol info' and if it comes back empty one time, I consider it in sync.
21:49 nonsenso so i went to remove the brick, which seems to be running well, but apparently in my haste, i added two bricks to the local filesystem.
21:49 _dist right, but if I was running 20 VMs, that might never happen (even if all were healed)
21:49 nonsenso can i stop a balance and remove another brick and continue the rebalance?
21:49 JoeJulian _dist: I am, but my workload is obviously different.
21:51 JoeJulian nonsenso: If you're removing a brick, it's not just a rebalance but a migration, though it'll probably fail and migrate files to the brick you're removing so that's moot.
21:51 nonsenso yeah, i'm just worried about the brick that's on my root fs filling up root during the migration
21:51 _dist JoeJulian: do you think my proposed use of a single volume for many kvm vms is unwise? the iops and throughput are decent so far.
21:51 JoeJulian nonsenso: is this replicated?
21:52 nonsenso yep.
21:52 JoeJulian _dist: Not in my opinion.
21:52 JoeJulian nonsenso: Is the "root" brick replicated to a non-root one?
21:52 nonsenso JoeJulian: the orig bricks are 1t.  root filesystem is 8gb and quickly filling.
21:52 nonsenso JoeJulian: no it's replicating to another root on.
21:52 nonsenso s/on/one
21:53 JoeJulian gah
21:53 nonsenso i done fucked up.
21:54 JoeJulian I'd stop the rebalance. replace-brick each to the correct brick. then I'd start the rebalance again.
21:55 JoeJulian I suspect the current brick is where you intended to mount some filesystem, so you'll have to (at least temporarily) mount that filesystem elsewhere and replace-brick to that elsewhere.
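A sketch of that replace-brick sequence, moving the accidental root-filesystem brick onto the intended disk temporarily mounted elsewhere (volume name, server and paths are all placeholders):

    gluster volume replace-brick myvol server1:/brick-on-root server1:/mnt/tmpdisk/brick start
    gluster volume replace-brick myvol server1:/brick-on-root server1:/mnt/tmpdisk/brick status
    gluster volume replace-brick myvol server1:/brick-on-root server1:/mnt/tmpdisk/brick commit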
21:55 nonsenso crap i have to afk.  i'll be back in a few.  parental unit at airport.
21:55 nonsenso JoeJulian: thanks as always, btw.  :)
21:55 JoeJulian You're welcome.
21:57 _dist JoeJulian: Ok, so the only thing I've had trouble verifying (because of the always healing list) is that even though I've mount.gluster brick1:/vol /localplace it seamlessly knows to go to brick2:/vol if brick1:/vol is behind. I know that's likely a core function, I just don't want to make any assumptions
21:59 haritsu joined #gluster
21:59 JoeJulian It is. Once a tcp connection to one brick of a replica pair is closed (or times out after ping-timeout seconds), the remaining brick continues to be used. The xattrs continue to be incremented. When the absent brick returns, those attrs are used to inform the client or the self-heal daemon to perform the heal.
22:00 kobiyashi @JoeJulian  now that you're back, can you tell me how concerned i should be when i constantly see this scrolling in my 2x2 dist/replication /var/log/glusterfs/bricks/static-content.log
22:00 kobiyashi [2013-11-26 21:33:03.158646] E [marker-quota-helper.c:229:mq_dict_set_contribution] (-->/usr/lib64/glusterfs/3.4.1/xlator/debug/io-stats.so(io_stats_lookup+0x157) [0x7f121ef662e7] (-->/usr/lib64/glusterfs/3.4.1/xlator/features/marker.so(marker_lookup+0x2f8) [0x7f121f17bfc8] (-->/usr/lib64/glusterfs/3.4.1/xlator/features/marker.so(mq_req_xattr+0x3c) [0x7f121f185f1c]))) 0-marker: invalid argument: loc->parent
22:00 JoeJulian _dist: Oh... and ,,(mount server)
22:00 glusterbot _dist: The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrdns
22:00 _dist what do you mean by mount server?
22:01 JoeJulian in your example, server1
22:01 JoeJulian er, brick1
22:01 _dist right, I built a 10gbe san for it, glusterbot told me what I wanted to hear :)
22:01 JoeJulian your example said "brick1:/vol" where brick1 is the mount server.
22:01 _dist right
22:01 andreask joined #gluster
22:01 _dist I actually named them node1 and node2 in this case
22:02 * JoeJulian shudders...
22:02 JoeJulian I die a little inside every time someone uses the word "node", fyi.
22:03 _dist well, melchior gaspar and balthasar were taken as the main host names and the bridge interface would get confused :) I can't use IP cause you can never change them (in my experience)
22:03 JoeJulian yep
22:03 kobiyashi JoeJulian can you please help me interpret my log entry
22:04 JoeJulian Just a "node" is *any* endpoint. People start throwing that word around willy-nilly and you never know if they're referring to a server, a client, whatever...
22:05 kobiyashi what do you call an irish pet server.....?
22:05 kobiyashi a node-er dame.....
22:05 JoeJulian kobiyashi: That says that marker-quota-helper's mq_dict_set_contribution received an invalid argument. I have 10 minutes before I have to leave and I don't think that's enough time to read the source for that routine and make a guess as to why that's happening.
22:06 kobiyashi ok i'll be sure to re-hash tthis with you...
22:06 * JoeJulian tries to get his $dayjob done too!
22:06 y4m4_ joined #gluster
22:07 kobiyashi i'm whining...i was just trying to get your expert advise
22:07 JoeJulian :-)
22:07 kobiyashi sorry /s/i'm/i'm not
22:07 JoeJulian No worries. Just making sure you know I'm not simply ignoring you without cause.
22:09 _pol joined #gluster
22:11 samppah joined #gluster
22:15 haritsu joined #gluster
22:22 gdubreui joined #gluster
22:33 _dist JoeJulian I assume you've left, but I've come across a scenario where I'm performing an add-brick and, during the process, on my mount.glusterfs mount the inside of my local mount shows the data on my local brick, not the data of the replica cluster (file sizes that are 0 etc)
22:35 _dist as soon as I attempt to access the file, it's there, full size. But a CP the first time would fail, a tail would be empty until the second attempt
22:40 dbruhn gah an exclusion from the main repo excludes it from all repos....
22:43 glusterbot New news from resolvedglusterbugs: [Bug 917272] program to just set trusted.glusterfs.volume-id, for running glusterfs standalone <http://goo.gl/tuuHZ>
22:45 a2 social, do you have geo-rep configured on your volume?
22:50 geewiz joined #gluster
22:57 mattapp__ joined #gluster
23:05 Gilbs1 left #gluster
23:05 _dist left #gluster
23:13 social a2: yes, in the test case there's geo-replication off the tested volume where the issue happens
23:15 haritsu joined #gluster
23:16 a2 social, so geo-rep was reading off the volume?
23:17 social yes
23:29 cjh973 can you deep mount with the fuse client?
23:29 cjh973 or is that only the nfs client?
23:36 mattapp__ joined #gluster
23:39 mattappe_ joined #gluster
23:40 mattapp__ joined #gluster
23:56 mattapp__ joined #gluster
23:58 andreask joined #gluster
