
IRC log for #gluster, 2014-07-31


All times are shown in UTC.

Time Nick Message
00:17 MacWinner joined #gluster
00:30 p0LL3R joined #gluster
00:41 bala joined #gluster
00:56 weykent joined #gluster
00:58 zerick joined #gluster
00:59 Peter2 joined #gluster
01:06 m0zes joined #gluster
01:13 gildub joined #gluster
01:51 bala joined #gluster
01:53 luckyinva joined #gluster
01:55 hagarth joined #gluster
01:57 recidive joined #gluster
02:00 chucky_z joined #gluster
02:00 chucky_z hola
02:00 chucky_z is it normal for gluster to use ~50% cpu when first set up?
02:21 chucky_z ahh, looks like self-heal is running? afr_dir_exclusive_crawl
02:21 chucky_z how long is this expected to take?  not too many files other than one directory
02:22 chucky_z (~3000 throughout 40 directories, then ~300,000 in one directory split by a few dirs)
02:34 haomaiwa_ joined #gluster
02:41 harish_ joined #gluster
02:50 haomaiw__ joined #gluster
02:53 coredump joined #gluster
02:56 bharata-rao joined #gluster
03:00 haomaiwa_ joined #gluster
03:03 haomai___ joined #gluster
03:10 luckyinva joined #gluster
03:12 nbalachandran joined #gluster
03:12 haomaiwa_ joined #gluster
03:13 Peter2 joined #gluster
03:14 DV joined #gluster
03:14 haomaiw__ joined #gluster
03:16 nishanth joined #gluster
03:24 spandit joined #gluster
03:37 lalatenduM joined #gluster
03:44 shubhendu joined #gluster
03:44 jobewan joined #gluster
03:44 haomaiwa_ joined #gluster
03:47 Humble joined #gluster
03:48 itisravi joined #gluster
03:52 ndk joined #gluster
03:53 kanagaraj joined #gluster
03:57 JoeJulian Please take a moment to become an openstack member (it's free) and vote for my presentation: https://www.openstack.org/vote-paris/Presentation/openstack-on-glusterfs-on-open-compute-a-reference-design
03:57 glusterbot Title: OpenStack on GlusterFS on Open Compute: A reference design (at www.openstack.org)
04:07 kdhananjay joined #gluster
04:10 dusmant joined #gluster
04:14 meghanam joined #gluster
04:14 meghanam_ joined #gluster
04:22 atalur joined #gluster
04:24 Rafi_kc joined #gluster
04:24 anoopcs joined #gluster
04:27 cjhanks joined #gluster
04:39 jiffin joined #gluster
04:40 aravindavk joined #gluster
04:47 ppai joined #gluster
04:49 ramteid joined #gluster
04:49 atinmu joined #gluster
04:49 ricky-ti1 joined #gluster
04:58 psharma joined #gluster
04:59 ndarshan joined #gluster
05:00 sputnik13 joined #gluster
05:02 sahina joined #gluster
05:03 glusterbot New news from resolvedglusterbugs: [Bug 764655] NetBSD port <https://bugzilla.redhat.com/show_bug.cgi?id=764655>
05:05 bala joined #gluster
05:11 prasanth_ joined #gluster
05:11 karnan joined #gluster
05:12 lalatenduM joined #gluster
05:18 prasanth|offline joined #gluster
05:19 prasanth_ joined #gluster
05:40 prasanth|offline joined #gluster
05:44 overclk_ joined #gluster
05:46 aravindavk joined #gluster
05:56 * prasanth|offline is away: I'm busy
06:00 lalatenduM joined #gluster
06:08 benjamin_ joined #gluster
06:09 psharma joined #gluster
06:15 rastar joined #gluster
06:15 vpshastry joined #gluster
06:19 gEEbusT joined #gluster
06:20 kumar joined #gluster
06:20 aravindavk joined #gluster
06:21 Humble joined #gluster
06:28 kanagaraj_ joined #gluster
06:30 Humble joined #gluster
06:43 kanagaraj joined #gluster
06:49 ekuric joined #gluster
06:57 rjoseph joined #gluster
06:57 ctria joined #gluster
07:04 glusterbot New news from newglusterbugs: [Bug 1125134] Not able to start glusterd <https://bugzilla.redhat.com/show_bug.cgi?id=1125134>
07:29 Intensity joined #gluster
07:29 fsimonce joined #gluster
07:41 stickyboy Turns out deleting 100,000 files + corresponding .glusterfs hard links from brick is ... slow.
07:42 stickyboy But ultimately worth it, as these are some split brain directories...
07:43 JoeJulian Next time I would try to avoid split-brain. It makes things much easier.
07:46 ricky-ticky joined #gluster
07:54 sputnik13 joined #gluster
07:57 Humble joined #gluster
08:06 harish_ joined #gluster
08:14 ricky-ticky1 joined #gluster
08:26 richvdh joined #gluster
08:32 overclk_ joined #gluster
08:40 Slashman joined #gluster
08:54 deepakcs joined #gluster
09:10 vimal joined #gluster
09:11 nbalachandran joined #gluster
09:17 ira joined #gluster
09:24 Pupeno joined #gluster
09:52 djav_ joined #gluster
09:52 djav_ hi! anybody around who has played with docker and glusterfs?
09:53 djav_ I'm trying to figure out to which extent the data volumes created are persistent even if all the containers go down
09:54 hchiramm joined #gluster
10:15 calum_ joined #gluster
10:28 saltsa joined #gluster
10:29 LebedevRI joined #gluster
10:30 ricky-ticky joined #gluster
10:45 rastar joined #gluster
10:56 edward1 joined #gluster
10:57 diegows joined #gluster
11:03 ndk joined #gluster
11:04 ricky-ticky joined #gluster
11:06 kkeithley joined #gluster
11:07 gildub joined #gluster
11:27 prasanth_ joined #gluster
11:35 glusterbot New news from newglusterbugs: [Bug 1117888] Problem when enabling quota : Could not start quota auxiliary mount <https://bugzilla.redhat.com/show_bug.cgi?id=1117888> || [Bug 1119827] Brick goes offline unexpectedly <https://bugzilla.redhat.com/show_bug.cgi?id=1119827>
11:51 kanagaraj joined #gluster
11:56 harish_ joined #gluster
12:05 glusterbot New news from resolvedglusterbugs: [Bug 1113007] nfs-utils should be installed as dependency while installing glusterfs-server <https://bugzilla.redhat.com/show_bug.cgi?id=1113007> || [Bug 1120151] Glustershd memory usage too high <https://bugzilla.redhat.com/show_bug.cgi?id=1120151> || [Bug 1113749] client_t clienttable cliententries are never expanded when all entries are used <https://bugzilla.redhat.com/show_bug.cgi?id=1113749> || [Bug 111
12:13 itisravi_ joined #gluster
12:13 DV joined #gluster
12:13 jbrooks joined #gluster
12:19 nbalachandran joined #gluster
12:22 luckyinva joined #gluster
12:26 karnan joined #gluster
12:43 djav joined #gluster
12:48 deepakcs joined #gluster
12:49 hchiramm joined #gluster
12:58 getup- joined #gluster
12:59 julim joined #gluster
13:01 cristov joined #gluster
13:04 julim joined #gluster
13:06 rwheeler joined #gluster
13:08 Pupeno_ joined #gluster
13:10 chirino joined #gluster
13:14 Pupeno joined #gluster
13:14 bala joined #gluster
13:14 ccha2 joined #gluster
13:14 bennyturns joined #gluster
13:15 FooBar_ joined #gluster
13:15 lava joined #gluster
13:15 ctria joined #gluster
13:15 karnan joined #gluster
13:15 tomased joined #gluster
13:15 xavih joined #gluster
13:15 harish_ joined #gluster
13:16 T0aD joined #gluster
13:16 purpleidea joined #gluster
13:16 JoeJulian joined #gluster
13:16 theYAKman joined #gluster
13:16 fsimonce joined #gluster
13:16 coredump joined #gluster
13:16 codex joined #gluster
13:16 ron-slc joined #gluster
13:16 RioS2 joined #gluster
13:19 gEEbusT joined #gluster
13:19 hflai joined #gluster
13:20 mibby joined #gluster
13:25 chucky_z joined #gluster
13:25 chucky_z hello, are there any conditions which would cause a massive amount of futex calls?
13:28 Maya_ joined #gluster
13:29 overclk_ joined #gluster
13:30 luckyinva joined #gluster
13:35 tdasilva joined #gluster
13:35 glusterbot New news from newglusterbugs: [Bug 1125277] ec-method.c fails to compile in function 'ec_method_encode' due to unknown register name 'xmm7' <https://bugzilla.redhat.com/show_bug.cgi?id=1125277>
13:55 hchiramm joined #gluster
13:56 edward1 joined #gluster
13:58 deepakcs joined #gluster
13:59 ctria joined #gluster
14:08 Maya_ joined #gluster
14:13 DV joined #gluster
14:15 ppai joined #gluster
14:22 anoopcs joined #gluster
14:23 getup- joined #gluster
14:23 wushudoin joined #gluster
14:23 hagarth joined #gluster
14:26 plarsen joined #gluster
14:35 glusterbot New news from newglusterbugs: [Bug 1125312] Disperse xlator issues in a 32 bits environment <https://bugzilla.redhat.com/show_bug.cgi?id=1125312>
14:36 xleo joined #gluster
14:40 ndk joined #gluster
14:42 stickyboy joined #gluster
15:04 recidive joined #gluster
15:08 cjhanks_ joined #gluster
15:08 daMaestro joined #gluster
15:11 Humble joined #gluster
15:13 tdasilva joined #gluster
15:14 lmickh joined #gluster
15:14 overclk_ joined #gluster
15:14 bala joined #gluster
15:15 overclk_ joined #gluster
15:20 ira joined #gluster
15:21 ndk` joined #gluster
15:27 p0LL3R joined #gluster
15:38 dusmant joined #gluster
15:45 dtrainor_ joined #gluster
15:46 dtrainor_ joined #gluster
15:48 chirino joined #gluster
16:09 Peter1 joined #gluster
16:14 Peter1 is there a way to do glusterfs client caching?
16:14 Peter1 i m getting IOWait on some glusterfs client
16:18 julim joined #gluster
16:41 Maya_ JoeJulian: Okay so it seems that the self-heal process is responsible for accessing 345MB of the swap! Self-heal completed a few days ago but the % of used swap space is still growing. Any ideas on what I should do?
16:44 p0LL3R joined #gluster
16:52 Humble joined #gluster
16:55 sputnik13 joined #gluster
17:01 bit4man joined #gluster
17:02 zerick joined #gluster
17:04 meghanam_ joined #gluster
17:05 meghanam joined #gluster
17:12 _dist joined #gluster
17:13 _dist I removed a brick a couple days ago from a replicate volume (3.4.2), and my client didn't like it. Is a remount of all clients recommended whenever you make a change to a volume?
17:14 JoeJulian Maya_: You can always kill the self-heal daemon: pkill -f glustershd
17:14 JoeJulian Maya_: Then restart glusterd again to start it back up.
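A minimal sketch of that sequence on a brick server, assuming a sysvinit-style service script (the exact restart command varies by distro):

    pkill -f glustershd         # stop the self-heal daemon
    service glusterd restart    # glusterd respawns glustershd when it comes back up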
17:15 JoeJulian _dist: Prior to 3.4.5, yes.
17:15 _dist JoeJulian: Thanks, my client locked up because the log file filled its HDD. It seems like it's getting tons of "permission denied" now, and the shd is complaining about "no active sinks"
17:16 julim joined #gluster
17:18 _dist The brick has a bunch of this https://dpaste.de/dPae (how concerned should I be?)
17:18 glusterbot Title: dpaste.de: Snippet #277255 (at dpaste.de)
17:18 xoritor joined #gluster
17:20 JoeJulian probably not, but if you're going to use user xattrs, and you haven't done so, you need to mount your bricks with user_xattr
17:21 xoritor hi JoeJulian
17:21 JoeJulian o/
17:21 xoritor the bd-xlator stuff is working _AWESOME_
17:22 tdasilva joined #gluster
17:22 JoeJulian That's really cool to hear. I don't suppose you're a blogger and could write something up about that?
17:22 xoritor remember the question i had about removing a volume?   turns out you were right, just remove the file
17:22 JoeJulian smrt
17:22 p0LL3R left #gluster
17:22 _dist it's true that we aren't (using user_xattr in the mount option), but this problem correlates directly to removing one of the replicate bricks. Do you think you could point me to the bug that was fixed in 3.4.5 so I can try to repair the damage done?
17:22 xoritor LOL... you tried to con me into doing the docs
17:23 JoeJulian Community involvement is what makes open source great!
17:23 xoritor JoeJulian, true... but writing docs sucks
17:24 * xoritor shrugs
17:24 xoritor just being honest
17:24 xoritor i agree we need them
17:24 JoeJulian You start blogging some topic, the industry sees you as an expert, pays your way to fly all over the place and give talks, puts you up in nice hotels... It's fun.
17:24 xoritor but it just stucks
17:24 xoritor s/stu/su/
17:24 JoeJulian And yes. Everyone hates writing docs.
17:24 xoritor lol.... hey i TEACH this stuff
17:25 glusterbot xoritor: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
17:25 xoritor how about that?
17:25 JoeJulian ... but I don't ask anyone to write docs, just maybe make an edit here or there.
17:25 xoritor yea thats true... it was only to edit the bd-xlator stuff
17:25 xoritor fix any issues
17:25 JoeJulian actually, that's a lie.
17:25 xoritor i contract teach for RH
17:25 JoeJulian I do ask the devs to write docs.
17:26 JoeJulian I get kind-of bitchy about it even.
17:26 daMaestro joined #gluster
17:26 xoritor mainly things like the RH436
17:26 xoritor 402
17:26 xoritor err.. RH401
17:26 xoritor that kind of stuff
17:26 JoeJulian RH401, class not found.
17:26 xoritor satellite
17:26 xoritor 436 is clustering and storage
17:26 xoritor ie... glusterfs
17:27 xoritor also gfs2, cman, etc...
17:27 xoritor im in NJ next week teaching satellite
17:27 xoritor heh
17:27 JoeJulian See, I need to go get my coffee...
17:27 xoritor LOL
17:27 xoritor me too
17:27 JoeJulian It should have been RH401, Unauthorized.
17:28 xoritor lol
17:28 JoeJulian I make door hangers that I take to hotels and hang on the door for room 404. "Room Not Found"
17:28 xoritor that would have been funny
17:28 xoritor so really i am trying to learn
17:29 xoritor i just got the bd-xlator working with teamd and lacp
17:29 xoritor on libvirt
17:29 JoeJulian best way to learn is teaching, am I right? :D
17:29 xoritor using direct attachment (no bridge) to the team0
17:29 xoritor ;-)
17:29 xoritor teaching and doing
17:29 xoritor i only contract teach and do it the rest of the time
17:30 xoritor practice what i preach
17:30 _dist JoeJulian: Please teach me how to correct my complaining volume :)
17:30 JoeJulian :P
17:30 xoritor _dist, whats the complaint
17:30 xoritor <interface type='direct'> <source dev='team0' mode='vepa'/>  <model type='virtio'/>
17:31 xoritor so cool....
17:31 _dist depends on the log, but I'd guess that somewhere in the self heal, or client it's still trying to resolve xattrs on a brick that was removed two days ago
17:31 xoritor darn fast lv replication
17:31 * JoeJulian needs to blog about "no active sinks"...
17:34 _dist this is what my fuse client complains about (it has been remounted since the change to the volume) https://dpaste.de/aysi
17:36 Maya_ joined #gluster
17:37 JoeJulian _dist: This is after remounting?
17:37 _dist yeap, 17:12 UTC is pretty recent
17:37 JoeJulian Just wasn't sure if you'd remounted though.
17:38 JoeJulian Any corresponding errors on any of the bricks?
17:38 _dist oh, I had to reboot because the OS drive got filled by logs :)
17:38 _dist let me check on that, I'm going to look in the brick and glusterd logs for something near that on both remaining brick hosts
17:40 _dist just an access request, the only other unusual thing is the sink stuff in the shd
17:41 _dist but I'm seeing new SETXATTR operations not permitted roll in realtime
17:44 _dist I haven't been able to find the bug (yet) that was fixed to prevent this
17:48 _dist at this point my only guess would be to restart the volume and hope these problems go away, but I can't do that until after hours
17:55 _dist new  error "[2014-07-31 17:50:56.394779] I [dict.c:370:dict_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.2/xlator/performance/md-cache.so(mdc_lookup+0x2ff) [0x7f76473c2b9f] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.2/xlator/debug/io-stats.so(io_stats_lookup_cbk+0xfe) [0x7f76471af54e] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.2/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x200) [0x7f7646f9b690]))) 0-dict: !t
17:55 glusterbot _dist: ('s karma is now -11
17:55 glusterbot _dist: ('s karma is now -12
17:55 glusterbot _dist: ('s karma is now -13
17:55 _dist :(
17:56 _dist wow how was it already at -10 :)
17:56 JoeJulian Actually, "(" is.
18:02 _dist right
18:02 _dist didn't notice that
18:02 JoeJulian It says the posix-acl lookups are the problem. Did you mount your bricks with acl?
18:03 _dist fstab entry "balthasar-gluster:/datashare_volume     /mnt/datashare_gluster  glusterfs       defaults,acl,_netdev 0 0" <-- but I'd find it way to coincidental that these problems only happened after removing a brick
18:03 glusterbot _dist: <'s karma is now -2
18:05 JoeJulian There's several remove-brick/replace-brick/add-brick problems prior to 3.4.5. One more after it that I'm still tracking down. All that I know about are client-side.
18:06 _dist I really appreciate your help, if I can just get through today without losing the work people are doing I'm hoping a restart of the volume will clear out these problems
18:06 _dist last time (when it was my fault, adding a brick where client didn't have DNS for new brick) I did a state dump and used some clear lock command on the files that were complaining
18:07 _dist but the errors were much easier for me to understand then
18:08 JoeJulian If it makes you feel any better, unless there's an " E " it's not an error.
18:09 JoeJulian But yes, I do wish some of the informational or warning messages gave a little more of a clue what information or warnings they're trying to convey.
18:09 _dist there are Es, mostly about getxattr failed (no data available)
18:09 _dist but those are quite old NM
18:10 _dist looks like Ws for today
18:14 _dist the worst part about this error is I can't easily trigger it, it seems to only affect certain paths/files so far
18:16 _dist could just be coincidence, but a lot of them are on thumbs.db only
18:17 JoeJulian Well there's your problem... windows.
18:17 julim joined #gluster
18:17 _dist hah
18:18 _dist it's not exclusive though, I'm not sure if you can remove a software raid mirror in windows without issue :)
18:22 _dist is there any way to confirm in these warning messages that it's just talking about my old brick?
18:25 JoeJulian file a bug report. Include client and brick logs, and "getfattr -m . -d -e hex" for any of the files/directories in question from the bricks. Include gluster volume info.
18:25 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
18:27 _dist even though I'm on 3.4.2?
18:30 _dist What should I make the title, I'm only seeing symptoms right now of what _might_ be a big problem
18:31 _dist I'll just put that it causes client errors
18:31 JoeJulian sounds good
18:32 luckyinva joined #gluster
18:34 ekuric joined #gluster
18:35 _dist so you want full logs since reboot? I can put them together in a zip
18:36 JoeJulian Sounds good.
18:37 _dist I assume the getfattr needs to run on the native FS on the brick correct?
18:38 JoeJulian correct
18:38 JoeJulian each replica
18:38 sputnik13 joined #gluster
18:40 _dist ok, so you want a side by side comparison of an example file? One that complained
18:44 _dist JoeJulian: Maybe I'm not using getfattr command correctly, nothing comes back
18:44 JoeJulian root
18:45 _dist would "getfattr -d -e hex ./Thumbs.db" be correct?
18:45 _dist (I'm in as root)
18:45 JoeJulian You missed the: -m .
18:46 _dist cool, thanks, didn't realize it was an attr pattern
18:47 JoeJulian trusted., secure, and one or two others are filtered out by default.
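Putting the exchange together, the full invocation (the path is only an example based on the brick mount discussed later in this log):

    getfattr -m . -d -e hex /zvol/gluster_datashare/some/dir/Thumbs.db
    # without "-m ." getfattr only matches user.* attributes, so the trusted.* ones never print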
18:49 Peter1 can the glusterfs client be tuned to do caching?
18:50 JoeJulian yes and no. What are you looking to gain?
18:50 _dist I care a lot about this bug, should I mark it as high or urgent?
18:50 JoeJulian ... and what are you willing to lose in return?
18:50 JoeJulian _dist: whatever. If the devs disagree they'll change it.
18:51 Peter1 i see high iowait on gluster mounts
18:51 Peter1 on the clients side
18:51 JoeJulian Ok, why?
18:51 Peter1 network got bottlenecked
18:51 Peter1 wonder if glusterfs client can do caching  on local disk io
18:51 JoeJulian So you want the data that's on the server to be cached on the client without going through the network?
18:51 Peter1 yes
18:51 Peter1 or buffer the write
18:51 Peter1 to the server
18:51 JoeJulian Would you like a unicorn with that? ;)
18:52 Peter1 i will take unicorn :)
18:52 JoeJulian You can turn on write-through caching...
18:52 Peter1 on client?
18:53 _dist ok logged it, I noticed there are alot of Es in the brick log actually, I attached it of course
18:53 mshadle joined #gluster
18:53 mshadle joined #gluster
18:54 JoeJulian Now... you've cached your write on client1. client2 wants to also make a change so it waits for the lock to be released and makes its change. How does client1 know that its cache is no longer valid?
18:55 Peter1 maybe i can lose the unwritten cache
18:55 Peter1 only take whatever is committed on the server
18:57 JoeJulian Need to think about it from the perspective of every client at the same time and optimize for the greatest effect across the entire system, rather than focusing on the smallest task. imho.
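For reference, the kind of tuning being discussed lives in the client-side performance translators, set per volume; a hedged sketch, assuming a volume named myvol (the sizes are arbitrary):

    gluster volume set myvol performance.cache-size 256MB               # io-cache read cache on each client
    gluster volume set myvol performance.write-behind-window-size 4MB   # per-file write-behind buffer
    gluster volume set myvol performance.flush-behind on                # let flush() return before data reaches the bricks

None of these gives a client a local on-disk cache of server data, which is the unicorn being asked for.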
18:58 _dist JoeJulian: So my takeaway from this should be: while on 3.4.2-x, never use add-brick or remove-brick? (My previous add actually worked ok though)
18:59 niccarp89 joined #gluster
19:00 niccarp89 hello!
19:00 qdk joined #gluster
19:00 glusterbot niccarp89: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
19:00 diegows niccarp89, hi!
19:01 niccarp89 anybody knows how to activate extendend acls over a mounted gluster filesystem ??
19:03 daMaestro joined #gluster
19:05 Maya_ joined #gluster
19:07 glusterbot New news from newglusterbugs: [Bug 1125418] Remove of replicate brick causes client errors <https://bugzilla.redhat.com/show_bug.cgi?id=1125418>
19:08 niccarp89 rsync output error :    rsync: rsync_xal_set: lsetxattr(""/filerute"","security.NTACL") failed: Operation not supported (95)
19:09 niccarp89 fstab line:        storage01:/gv0 /srv/data glusterfs defaults,acl 0 0  .         i also tried with user_xattr but then the share does not mount
19:13 semiosis niccarp89: what brick filesystem are you using?  xfs?  ext4?
19:14 niccarp89 xfs
19:14 niccarp89 acls will be activated by default, right?
19:14 _dist niccarp89: we are using ext4 with just acl (fstab) and everything works well (we use a linux host to act as an SMB gateway for windows)
19:15 semiosis niccarp89: correct, xfs always has acl
19:15 niccarp89 fstab of one brick:            /dev/sdc /data xfs defaults,noatime 0 0
19:16 diegows _dist, are you using vfs samba plugin? or are you mounting the volume and publishing it using samba?
19:16 * diegows works with niccarp89
19:16 _dist publishing via samba, used pbis for domain integration
19:17 _dist niccarp89: our fstab for the brick mount "/dev/chest/gluster_datashare-part1      /zvol/gluster_datashare ext4    acl,user_xattr,defaults"
19:18 diegows acls are enabled by default in xfs, I don't remember user_xattrs right now
19:19 diegows we should try
19:32 Peter1 semiosis: any eta on the 3.5.2-ubuntu? :)
19:32 Peter1 excited as i need the heal info command :)
19:34 diegows we tried user_xattrs in the bricks and nothing
19:34 diegows we still have operation not supported setting xattrs
19:34 diegows user_xattrs in the client side doesn't work, it doesn't mount
19:34 diegows any hint?
19:34 Peter1 diegows, i got the same error too
19:35 diegows have you fixed?
19:35 diegows :)
19:35 Peter1 no :(
19:35 _dist our user_xattrs isn't on our client mount, but on our brick mount
19:35 _dist wait let me check that, I may have mis-spoke
19:35 diegows _dist, thanks!
19:36 _dist user_xattr on our brick mount, only acl on our client mount
19:36 _dist brick mounts "acl,user_xattr,defaults" client mount " glusterfs       defaults,acl,_netdev 0 0"
19:36 daMaestro joined #gluster
19:37 semiosis Peter1: real soon now
19:37 Peter1 Thanks!
19:37 _dist Peter1: what is the new heal info command?
19:37 Peter1 it was missing on the ubuntu distro
19:37 _dist oh in 3.5.2 ?
19:37 Peter1 3.5
19:37 _dist I wouldn't like that :)
19:38 Peter1 _dist is the server ubuntu?
19:38 _dist our client for mounting is yes
19:38 _dist ours servers with the bricks are debian wheezy
19:38 _dist all are running glusterfs-3.4.2-1
19:39 Peter1 oic, i wonder if i need that acl,user_xattr,defaults on my bricks
19:39 _dist I'm not sure, I know for ext4 we did
19:40 Peter1 i m using xfs for my bricks
19:40 Peter1 i thought xattr comes by default for xfs?
19:41 _dist I actually think there was a reason we didn't go xfs, but it was over a year ago. Sorry I'm not sure on that just hazy memories
19:41 Peter1 that's fine :)
19:41 niccarp89 when i try to mount on gluster client with user_xattr give me this output:  Mount failed. Please check the log file for more details.
19:42 niccarp89 it points to the gluster log? i dont see relevant info about the mount there
19:43 _dist once again our user_xattr is on the FS mount for the brick, not the gluster fuse mount
19:43 _dist (but I'm not saying it isn't gluster option, I don't know I just know we aren't using it)
19:44 niccarp89 Thanks _dist, i also try on the bricks, just to show more info in order to fix the error
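To recap the working layout _dist describes, using his own fstab lines: user_xattr belongs on the brick's backing filesystem mount, while the gluster client mount only takes acl (the client mount refuses user_xattr, as seen above):

    # server side: the brick's backing filesystem (ext4 needs user_xattr; xfs stores xattrs without it)
    /dev/chest/gluster_datashare-part1   /zvol/gluster_datashare   ext4        acl,user_xattr,defaults   0 0
    # client side: the gluster fuse mount
    balthasar-gluster:/datashare_volume  /mnt/datashare_gluster    glusterfs   defaults,acl,_netdev      0 0

Note the rsync failure above involves the security.* namespace (security.NTACL), which neither of these options controls.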
19:51 Maya_ joined #gluster
20:05 Jokeacoke joined #gluster
20:06 Jokeacoke hello
20:06 glusterbot Jokeacoke: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
20:06 _dist hah, do you ever find glusterbot a bit passive aggressive? :)
20:07 Jokeacoke Where i can read some more information about a .gluster hidden directory?
20:07 glusterbot New news from newglusterbugs: [Bug 1125431] GD_OP_VERSION_MAX should now have 30700 <https://bugzilla.redhat.com/show_bug.cgi?id=1125431>
20:07 _dist Jokeacoke: not very well hidden I guess, http://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/
20:07 glusterbot Title: What is this new .glusterfs directory in 3.3? (at joejulian.name)
20:08 Jokeacoke yeah! that was it
20:08 * _dist assumed you meant .glusterfs but there could be some new .gluster directory in > 3.4.2
20:08 JoeJulian What?!?! heal info wasn't missing in 3.5.1 on ubuntu...
20:09 Jokeacoke Was read that before
20:13 Jokeacoke I was just disappointed: i share a large amount of data (more than 200TB) via nfs and get nfs stale file handle errors every time a client writes to the shared directory
20:16 Peter1 joined #gluster
20:16 andreask joined #gluster
20:19 sputnik13 joined #gluster
20:21 andreask1 joined #gluster
20:22 andreask joined #gluster
20:22 ira joined #gluster
20:45 andreask1 joined #gluster
20:47 andreask joined #gluster
20:48 Peter1 if i decided not to use NFS on gluster, is there a way i can turn nfs off?
20:52 calum_ joined #gluster
20:54 _dist Peter1: yes it's a volume option
20:54 _dist gluster volume set help
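The specific knob is nfs.disable; assuming a volume named myvol:

    gluster volume set myvol nfs.disable on    # stop exporting this volume via gluster's built-in NFS server
    gluster volume set help                    # lists this and the other volume options, as _dist suggests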
20:55 _dist JoeJulian: it looks like all these offending thumbs.dbs were created exactly when I removed the brick
20:56 JoeJulian That's weird. Normally it's just every time you open a folder in windows.
20:56 _dist yeah I know, that's crazy
20:56 JoeJulian Well, a folder that hasn't been opened before.
20:56 _dist date modified and created
20:57 Maya_ joined #gluster
20:57 JoeJulian What if you just delete them all?
20:57 _dist yeap, that's the plan, honestly they weren't there before
20:57 JoeJulian I would do a fuse mount, find exec rm
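Spelled out with the fuse mount point from _dist's fstab above:

    find /mnt/datashare_gluster -name 'Thumbs.db' -exec rm -f {} \;    # always delete through a client mount, never directly on a brick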
20:57 _dist I can do it from the SMB proxy server's mount
20:58 _dist I have stopped/started the volume as well by now
20:58 _dist so I'm going to watch the next heal and look for that sink stuff (what is that anyway?)
20:59 JoeJulian Ooh, you're still using fuse mounts for samba? The vfs is very nice.
21:00 _dist didn't know about it
21:00 Peter1 where can i get the deb for gluster?
21:00 _dist (till today when others were talking about it)
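For the curious: the Samba VFS module JoeJulian means lets smbd talk to the volume through libgfapi instead of a fuse mount. A sketch of an smb.conf share, assuming a Samba build that ships vfs_glusterfs; the share name is made up, the volume and server names are taken from _dist's fstab:

    [datashare]
        vfs objects = glusterfs
        glusterfs:volume = datashare_volume
        glusterfs:volfile_server = balthasar-gluster
        ; path is interpreted relative to the root of the gluster volume
        path = /
        read only = no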
21:00 JoeJulian source and sink are replication terms. Source should have ,,(extended attributes) that show a needed self-heal. Sink is the target that needs the heal.
21:00 _dist Peter1: most are stored in the download section of gluster
21:00 JoeJulian @ppa
21:00 glusterbot (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
21:00 glusterbot JoeJulian: The official glusterfs packages for Ubuntu are available here: 3.4 stable: http://goo.gl/u33hy -- 3.5 stable: http://goo.gl/cVPqEH -- introducing QEMU with GlusterFS 3.4 support: http://goo.gl/7I8WN4
21:01 JoeJulian _dist: When you removed a brick, did you change the replica count?
21:01 _dist JoeJulian: yeap, it won't let you unless you do
21:02 _dist so what does this mean "0-gvms2-replicate-0: no active sinks for performing self-heal on file <gfid:e51d7867-b306-4ec5-ae96-2c6de69c172c>" ?
21:02 JoeJulian Sure it will, if you have a distributed volume or even a distributed replicated volume.
21:02 _dist pure replicate
21:02 JoeJulian That explains everything.
21:03 _dist cool, can you translate? :)
21:04 JoeJulian Trying to find the bug...
21:07 xleo joined #gluster
21:08 _dist Also, now that you have a hint of the problem, is there damage, if so is it severe?
21:09 JoeJulian no damage
21:11 * _dist likes that kind of thing
21:12 JoeJulian bug 1104861
21:12 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=1104861 unspecified, unspecified, ---, pkarampu, CLOSED WONTFIX, AFR: self-heal metadata can be corrupted with remove-brick
21:14 _dist there were no pending self heals
21:15 _dist that heal info would report anyway
21:16 * _dist reads http://www.gluster.org/community/documentation/index.php/Features/persistent-AFR-changelog-xattributes
21:17 _dist yeah this doens't match my scenario, I removed while all three were up, and watch -n1 showed no pending heal stuff.
21:18 JoeJulian I'm quite positive that's the problem.
21:18 JoeJulian what's your bug id?
21:18 _dist https://bugzilla.redhat.com/show_bug.cgi?id=1125418
21:18 glusterbot Bug 1125418: high, unspecified, ---, gluster-bugs, NEW , Remove of replicate brick causes client errors
21:19 _dist is this the simplest way to get a filename from gfid ? https://gist.github.com/harshavardhana/6747924
21:19 glusterbot Title: GlusterFS GFID to File conversion based on inode number (at gist.github.com)
21:20 _dist (weird that the no active sinks error doesn't post to heal-failed) ? Maybe I'm misunderstanding the nature of it
21:20 JoeJulian @gfid resolver
21:20 glusterbot JoeJulian: https://gist.github.com/4392640
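Both scripts lean on the same layout: for a regular file, <brick>/.glusterfs/<first two hex chars>/<next two>/<gfid> is a hard link to the real file, so matching on inode finds the path; for a directory the entry is a symlink instead. A hedged sketch run on one brick (brick path from _dist's setup, gfid is a placeholder):

    BRICK=/zvol/gluster_datashare
    GFID=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
    find "$BRICK" -samefile "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID" -not -path '*/.glusterfs/*'
    # if the .glusterfs entry turns out to be a symlink, the gfid is a directory; readlink shows its parent gfid and name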
21:21 JoeJulian find $brick_dir -exec setfattr -x trusted.afr.datashare_volume-client-2 {} \;
21:21 JoeJulian on both bricks
21:22 _dist that'll fix what happened?
21:23 JoeJulian More or less.
21:23 JoeJulian There's still a chance that it will expose a split-brain condition.
21:23 JoeJulian But that's easily fixed as well.
21:25 _dist I'm not "checking your math", but I gather files out there will have a client=X where x is not 1 or 2 ?
21:26 JoeJulian No, the possibility would be where one would have a client-1 with a non-zero value and the other would have a client-0 with a non-zero value.
21:26 JoeJulian The client-2 is a left-over from removing a brick. If the third brick was the one you removed, then it's a non-issue.
21:27 _dist got it, os the only part I need to modify is the brick dir
21:27 _dist so*
21:27 _dist I also assume I need to take the volume offline to do this?
21:27 JoeJulian As a complete aside, I never remove a brick to do maintenance on it. I just shut it down and do it.
21:28 JoeJulian No, you can get away with doing that live.
21:30 _dist I was changing the underlying FS :)
21:30 _dist also my .glusterfs directories do not have a gf sub
21:30 _dist so that scripts won't work
21:30 _dist script*
21:31 JoeJulian The gfid is a hexadecimal sequence, 8-4-4-4-12
21:31 JoeJulian You don't put "gfid" in front of it.
21:32 firemanxbr joined #gluster
21:32 _dist it's running, but a lot of the files are saying no such attribute
21:33 _dist (I didn't put gfid in front) :)
21:33 JoeJulian Nothing to worry about
21:33 JoeJulian Yes you did or it wouldn't have looked for a "gf" directory. "G" is not a hexadecimal digit.
21:34 * _dist is completely incorrect
21:36 _dist ok, it's running on both, I'm a little frightful
21:38 _dist worst case I suppose I can rebuild the volume from a backup, but that means a late night
21:38 _dist I'm also seeing another message
21:39 _dist JoeJulian: ./.glusterfs/0d/dc/0ddc1d19-8b6b-4573-8096-d3c826457608: Too many levels of symbolic links
21:39 recidive joined #gluster
21:41 _dist I think the GFIDs that the shd log errors are pointing to only exist on the removed brick?
21:41 JoeJulian kill it and add a "-h" to the setfattr
21:42 _dist done, but it still traverses the .glusterfs directory
21:42 _dist is that ok?
21:42 JoeJulian Yep, that's fine.
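So the corrected cleanup, run as root on each remaining brick (the attribute name comes from JoeJulian's earlier command; "No such attribute" complaints on files that never carried it are harmless, as noted above):

    BRICK_DIR=/zvol/gluster_datashare      # the brick directory on this server
    find "$BRICK_DIR" -exec setfattr -h -x trusted.afr.datashare_volume-client-2 {} \;
    # -h acts on symlinks themselves instead of following them, which avoids the symlink-loop errors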
21:43 _dist this might be the first time I've ran a command on someone's instruction that I did not fully understand myself :)
21:46 _dist so is this a _real_ fix, or should I migrate to another volume later?
21:46 JoeJulian It's a real fix.
21:47 JoeJulian It's just cleaning up xattrs that are no longer valid.
21:50 _dist if we can confirm my issue was the same, I should close my bug
21:54 _dist one brick done, second still running
21:54 _dist both done
21:54 _dist what should I check ?
21:55 JoeJulian I guess just look and see if your concerns stopped being logged.
21:56 _dist Looks like the client errors stopped
21:56 bala joined #gluster
21:57 _dist O
21:57 _dist I'll run a heal now and watch the shd
21:58 _dist looking good so far, you said there's a chance the heal might turn up a split-brain file?
22:00 JoeJulian slim chance
22:02 _dist self heal did find another "no active sinks" file
22:04 _dist it doesn't appear to exist in either .glusterfs directory
22:06 _dist seems like it's just a few files per crawl, but none of the gfids are present in .glusterfs on either brick
22:06 _dist also, I really appreciate your help, and hope anxious doesn't translate to demanding over irc :)
22:07 _dist cd ..
22:08 _dist not the worst thing to type in the wrong place
22:10 JoeJulian _dist: No worries.
22:10 _dist these self heal errors, any idea how I get rid of them? I can't find the gfids on either brick
22:13 JoeJulian Look in .glusterfs/indices/xattrop
22:14 _dist empty on both bricks
22:15 JoeJulian Not sure then... maybe a heal...full?
22:15 _dist sure, I'll give that a try
22:15 _dist also maybe a volume stop/start? I'm just taking guesses though
22:15 JoeJulian Nah, I'd go with the heal
22:16 JoeJulian I don't think a stop/start will actually do anything. This isn't Windows.
22:16 _dist ok running
22:16 _dist yeah I think you're right, I've only actually had to stop/start to deal with lock issues
22:20 _dist I really like proxmox, but being tied to their custom repos _is_ the cause of this, and likely the VM healing issue (well that and the bug presence). Is there a way they could backport these fixes? I'm looking into openstack on my home setup, last pass on it I wasn't happy
22:21 _dist happened again during full heal, same gfids as last time
22:24 _dist oh, interesting
22:24 _dist JoeJulian: these heal issues (no active sinks) are gfids that resolve to files in my vm volume, and are in that xattrop directory
22:25 _dist (so not related to the fileshare issue at all) which I'd say is fixed
22:25 _dist but while we're on the topic, what could the error mean for a VM file?
22:28 JoeJulian It seems to mean that the destination for the self-heal is unreachable.
22:28 JoeJulian But I don't know how accurate that statement is. I see those all the time and it's clearly connected and the files are in sync.
22:28 _dist the file is a running VM, I have to imagine that this error has gone unnoticed for some time
22:29 _dist I suspect it might be related to them showing up in heal info, is that reasonable?
22:29 JoeJulian yep
22:29 JoeJulian It's probably all about how active that VM is.
22:33 _dist I wonder if I should just turn shd off on my VM volume
22:37 _dist is this ok ? "trusted.afr.gvms2-client-0=0sAAAAAgAAAAAAAAAA"
22:38 _dist looks similar to others, I need to learn more about the gluster xattrs, I'm going to close my ticket and ref the one you showed me
22:38 JoeJulian dump it as hex, ie. -e hex
22:38 _dist oh right
22:39 _dist 0=good
22:39 _dist oh -x removes it, I finally get what your command did now
22:40 JoeJulian Oh, good.
22:40 _dist and, it should be easy to write an ssh script to check this on all VMs, which'll give me a real heal info
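Something along these lines would do it; the hosts, brick path and image glob are all placeholders for _dist's environment:

    for host in brick1 brick2; do
        echo "== $host =="
        ssh root@"$host" "find /path/to/gvms2-brick -name '*.qcow2' -exec getfattr -m trusted.afr -d -e hex {} +"
    done
    # any non-zero trusted.afr.* counter is a pending operation; all zeros on both bricks means the copies agree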
22:42 _dist ok, so this is why this one shows up
22:42 _dist "trusted.afr.gvms2-client-0=0x000000020000000000000000"
22:43 _dist (1 & 2) are the same
22:44 JoeJulian That says at the moment you performed that command, there were 2 pending data operations in process for client-0
22:45 _dist ah, it doesn't change and it's the same for both client 0 and 1
22:45 _dist is there a chart for the hex codes so I can know what means what?
22:45 JoeJulian @extended attributes
22:45 glusterbot JoeJulian: (#1) To read the extended attributes on the server: getfattr -m .  -d -e hex {filename}, or (#2) For more information on how GlusterFS uses extended attributes, see this article: http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
22:45 JoeJulian See #2
22:50 mjrosenb joined #gluster
22:50 _dist ok, I get it. Except I'll probably need to read the source to get the true value of the hex code?
22:51 jruggiero joined #gluster
22:51 JoeJulian They're just counters.
22:51 _dist sure, but each for different operations I assume
22:52 JoeJulian Like it says, data, metadata, namespace
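Concretely, taking the value _dist pasted above, trusted.afr.<vol>-client-N is three big-endian 32-bit counters:

    # trusted.afr.gvms2-client-0 = 0x 00000002 00000000 00000000
    #                                  ^data    ^metadata ^entry (namespace)
    # here: 2 pending data operations, 0 metadata, 0 entry; all zeros means nothing is pending against that brick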
22:53 _dist for some reason this VM believes (indefinitely?) that whatever operation goes in that slot, there are two of them to go, but even when I use the VM and it's responsive the counter never goes back to 0
22:54 JoeJulian Shutdown the VM and it will.
22:54 _dist How come? (I can't take this particular one down to see)
22:57 JoeJulian Because then the VM won't be writing to it.
22:57 _dist ok, I found a vm I could shutdown that had it
22:57 JoeJulian btw... 3.5.2 isn't supposed to falsely report that as needing healed.
22:58 _dist I'm running 3.4.2 though
22:58 _dist vm is off, 2 turned 1, still hanging at 1 though
22:58 JoeJulian There's nothing I can do about that other than to tell you what you're missing by not upgrading.
22:58 _dist I know, like I said I'd love to upgrade
22:59 _dist I have a plan in works for it, it'll just take time
22:59 JoeJulian I know about taking time. But critical bugs that will cause my clients to crash have a direct impact on my customers, making emergency change windows required.
23:00 _dist both bricks (with the vm off) believe that both clients currently have 1 pending "something"
23:00 _dist I don't consider the vm reporting issue critical, and I can avoid using remove-brick for now. I'm thinking a month for migration to a new platform
23:00 JoeJulian If you're running anything less than 3.4.5, do not rebalance. Less than 3.4.6, do not add or remove bricks.
23:01 _dist I appreciate that, I added a brick ok before. But I'll stay clear of it until I upgrade
23:01 JoeJulian If the VM is off, you should be able to safely set those to 0 on both bricks.
23:02 JoeJulian Adding, removing, or replacing bricks with open fds (like VMs) can cause a client crash.
23:02 JoeJulian I'm working with pranithk to isolate what causes that.
23:02 _dist but that isn't a problem in 3.5.2 ?
23:03 JoeJulian Should be
23:03 _dist shouldn't* ?
23:03 JoeJulian No, should.
23:03 _dist ah
23:03 _dist maybe the remove and add brick commands should have a disclaimer since they are dangerous?
23:03 JoeJulian luckily I have the resources and the time to dedicate to finding and reporting these kinds of bugs.
23:04 JoeJulian They didn't know until I found this.
23:05 _dist I really love gluster, but I assume the production user base must be pretty small
23:05 JoeJulian Seems to be upwards of 30k.
23:06 JoeJulian I imagine a large percentage of that uses static volumes.
23:06 _dist but, I've seen people recommending using bricks as software raid instead of raid. If anyone did that, well don't do that :)
23:07 _dist and perhaps users who use it for VMs are less vocal, or perceptive than I am. I'm not criticizing the community, my intention is to stick around and contribute where I can.
23:08 delhage joined #gluster
23:09 _dist alright, well I really appreciate your help in stripping out the old client xattr, truly. It's time for a very late supper :)
23:10 * _dist guesses that no active sinks means it doesn't know what to do, cause everyone agrees, but there are pending ops
23:11 JoeJulian Did you remove the first brick?
23:12 JoeJulian It shows a pending write for both itself and its replica. But it can't self-heal that because that represents a pending operation. The self-heal daemon doesn't have any way of knowing what that operation is.
23:12 JoeJulian If it only showed a write for its replica, then the replica would be defined as the sync.
23:13 JoeJulian er, sink
23:13 JoeJulian When all you have is fools, though, you cannot define source and sink.
23:53 bgupta joined #gluster
