
IRC log for #gluster, 2014-07-26


All times shown according to UTC.

Time Nick Message
00:00 bennyturns Peter1, here is what I do to clean everything up http://pastie.org/9421434
00:00 Peter1 sure
00:00 glusterbot Title: #9421434 - Pastie (at pastie.org)
00:02 fubada joined #gluster
00:02 Peter1 ic…make sure all brick processes die...
00:04 bennyturns ya
00:08 Peter1 hmm i think i was wrong… still seeing the same hang on ls at the beginning
00:08 Peter1 i thought enabling acl on the brick would help...
00:09 overclk joined #gluster
00:10 Peter1 got these errors when i do ls
00:10 Peter1 http://pastie.org/9421444
00:10 glusterbot Title: #9421444 - Pastie (at pastie.org)
00:12 verdurin joined #gluster
00:13 bennyturns does it return with anything?
00:13 Peter1 yes
00:13 Peter1 it eventually gets through it
00:13 Peter1 and i noticed these on another export
00:13 Peter1 W [socket.c:522:__socket_rwv] 0-NLM-client: readv on 10.40.4.64:47334 failed (No data available)
00:15 Peter1 and this I [dht-common.c:635:dht_revalidate_cbk] 0-sas02-dht: mismatching layouts for
00:18 bennyturns Peter1, you are using gigabit, right?
00:19 Peter1 10g
00:19 bennyturns how long does your ls actually take on your 10k files dir?
00:19 Peter1 clients are gigE
00:19 bennyturns right so you will only get gigabit speeds from the client
00:19 Peter1 right
00:19 MacWinner joined #gluster
00:20 Peter1 running a time on ls
00:20 Peter1 sometimes it hangs for a looooooong time
00:20 bennyturns it should take a bit of time over GB
00:20 Peter1 just getting another one that hangs.....
00:21 bennyturns like minutes?
00:21 bennyturns not 10s of minutes
00:21 Peter1 i have a dev system too running on dual gigE and didn't take that long
00:21 Peter1 i think this one is hung
00:22 bennyturns yep something is funky here, just trying to get an idea of how long ls should take
00:22 Peter1 right
00:22 bennyturns I'm still creating files, I'll tell ya what my 10G system takes
00:22 bennyturns I don't have any 1G up atm
00:22 Peter1 ic
00:22 Peter1 ya very funky
00:23 mojibake JoeJulian: Thank you. add-brick $vol replica 3 newserver:/path is what I was looking for, just did not know how to ask it. I will keep researching and reviewing documentation to avoid "doing it wrong." because I am definitely looking to do it right.
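For reference, the replica expansion mojibake is describing is a single CLI call followed by a full heal; a minimal sketch, assuming a hypothetical volume named myvol and brick path /bricks/myvol:

    gluster volume add-brick myvol replica 3 newserver:/bricks/myvol
    gluster volume heal myvol full    # let the self-heal daemon populate the new brick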
00:24 Peter1 just ran that on my dev
00:24 Peter1 real    0m2.117s
00:24 Peter1 user    0m0.192s
00:24 Peter1 sys     0m0.112s
00:24 Peter1 but my prod is hung now
00:24 JoeJulian That just sounds wrong...
00:27 Peter1 got to kill that ls and rerun and it seems fast again
00:27 Peter1 something really funky going on
00:28 bennyturns gluster will cache stats, 1st run will be slow, the following ones will be faster
00:29 JoeJulian correction, when using NFS, the kernel will cache stats.
00:30 bennyturns oh yeah :P
00:31 bennyturns keep forgetting we are mounting nfs :)
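Since the NFS client's kernel caches stat results, one way to get a cold-cache timing for comparison is to drop the client-side caches between runs; a rough sketch, with the mount point as a placeholder:

    sync
    echo 3 > /proc/sys/vm/drop_caches     # drop pagecache, dentries and inodes on the client
    time ls -l /mnt/sas02/bigdir > /dev/null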
00:33 bennyturns for me on 10G with 150000 files ls -l took real 2m57.047s
00:33 dcope joined #gluster
00:34 Peter1 it's so strange….strace ls -> no hang
00:34 Peter1 ls  -> hang :(
00:36 Peter1 this nfs intermittent issue is annoying
00:38 bennyturns I wish I could be more helpful Peter1 :(  gonna have to log off for a bit though
00:39 Peter1 np. Very thankful for your help!!!
00:39 Peter1 great to know someone is working on it with me and i'm not alone
00:39 bennyturns keep me updated!  also, we may want to email gluster-users list for some feedback from others
00:39 Peter1 sure!
00:40 bennyturns ttyls!  GL!
00:46 sputnik13 joined #gluster
00:48 DV_ joined #gluster
01:00 Peter1 i keep getting these I [afr-self-heald.c:1687:afr_dir_exclusive_crawl] 0-sas01-replicate-1: Another crawl is in progress for sas01-client-2
01:00 Peter1 why are there always multiple crawls happening on my volume?
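To see whether those overlapping crawls correspond to real pending heals, the heal info sub-commands are the usual starting point; a sketch using the volume name from the log message (the exact sub-commands available depend on the 3.x release in use):

    gluster volume heal sas01 info           # entries still pending self-heal
    gluster volume heal sas01 info healed    # entries healed recently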
01:17 Andreas-IPO joined #gluster
01:17 sputnik13 joined #gluster
01:19 diegows joined #gluster
01:24 theron joined #gluster
02:10 diegows joined #gluster
02:33 hagarth joined #gluster
03:40 kshlm joined #gluster
05:13 sputnik13 joined #gluster
06:15 kumar joined #gluster
06:16 kumar joined #gluster
06:41 LebedevRI joined #gluster
06:51 ramteid joined #gluster
07:06 cultavix joined #gluster
07:11 LebedevRI joined #gluster
07:22 jobewan joined #gluster
07:22 RobertLaptop joined #gluster
07:28 JustinCl1ft joined #gluster
07:28 msvbhat_ joined #gluster
07:31 cultavix joined #gluster
07:39 ricky-ti1 joined #gluster
07:42 vu joined #gluster
07:46 Humble joined #gluster
08:26 anoopcs joined #gluster
08:35 anoopcs joined #gluster
08:51 ricky-ti1 joined #gluster
09:03 anoopcs joined #gluster
09:07 anoopcs joined #gluster
09:09 cultavix joined #gluster
09:41 stickyboy joined #gluster
09:54 jiku joined #gluster
10:09 hchiramm joined #gluster
10:24 XpineX joined #gluster
12:10 cultavix joined #gluster
12:19 cultavix joined #gluster
12:44 diegows joined #gluster
12:50 sputnik13 joined #gluster
12:50 rotbeard joined #gluster
13:13 stickyboy joined #gluster
13:19 LebedevRI joined #gluster
13:24 richvdh joined #gluster
13:26 swebb joined #gluster
13:41 cultavix joined #gluster
14:10 XpineX joined #gluster
15:21 an joined #gluster
15:22 kumar joined #gluster
15:47 rotbeard joined #gluster
15:55 rotbeard joined #gluster
16:52 anoopcs joined #gluster
17:06 jiku joined #gluster
17:15 anoopcs1 joined #gluster
17:27 Maya__ joined #gluster
17:44 Maya__ Hi there- I'm a Gluster newbie and have an issue with self-heal. My situation is a little unusual in that when I created a volume in Replica mode between 2 bricks, one of the bricks already contained 600GB of files. When I triggered self-heal on the empty node, files seemed to be replicating across but then stopped and I can't seem to start the self-heal daemon again. I'm running on 3.5.1 and my glustershd.log file is
17:44 Maya__ reporting "Skipping entry self-heal because of gfid absence" for everything. Also, when I run the heal info command, gluster reports "Volume heal failed". Can anyone help me out?!
17:46 JoeJulian Maya__: Technically what you're describing is "unsupported" and produces "undefined results" according to the developers. Granted, however, we've been doing it that way for years. What version are you using?
17:46 Maya__ Hi Joe, I'm on the latest version, 3.5.1
17:47 JoeJulian Try mounting your volume somewhere and walking that client mount with "find | xargs stat >/dev/null"
17:49 Maya__ Terrific, thanks, I'll give that a try!
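Spelled out, the crawl JoeJulian suggests just stats every file through a client mount so that each lookup triggers self-heal; a sketch with hypothetical server, volume and mount point names:

    mount -t glusterfs server1:/myvol /mnt/myvol   # any FUSE client mount of the volume
    cd /mnt/myvol && find . | xargs stat > /dev/null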
17:58 RioS2 joined #gluster
18:05 1JTAAZTOO joined #gluster
18:09 chirino joined #gluster
18:12 plarsen joined #gluster
18:13 Philambdo joined #gluster
18:32 Christian87 joined #gluster
18:32 Christian87 Hello
18:32 glusterbot Christian87: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
18:34 Christian87 I would like to run my gluster setup as something like raid5, is this possible now? i only find workarounds written in 2009
18:34 JoeJulian no
18:36 JoeJulian With the new-style replication that's slated for 3.6, that may become possible though.
18:36 Christian87 so a setup with, let's say, 5 servers with 20gig space each, where i can use 80gig and it's no problem if one server fails, is not possible?
18:37 JoeJulian With 5 20 gig servers, you can have 50 gigs of replicated fault-tolerant storage
18:39 JoeJulian but really... having 32 gigs per server is really cheap and gives you all 80 gigs, replicated. I think I have enough money for that in the ashtray of my car.
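The back-of-the-envelope math behind those two numbers, assuming plain replica 2:

    5 servers x 20 GB    = 100 GB raw
    usable at replica 2  = 100 GB / 2 = 50 GB
    for 80 GB usable     : 80 GB x 2 = 160 GB raw, i.e. 32 GB per server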
18:39 Christian87 I see, so the most usable solution right now would be to always add pairs of servers to a replica 2 volume
18:39 JoeJulian Are you doing this with Raspberry Pi?
18:40 JoeJulian yes
18:40 Christian87 That was just an example, I want to use it in a setup where every server has 24TB of storage
18:41 JoeJulian Well then... that raises the cost per TB a bit more...
18:41 JoeJulian Though a RaspPi home gluster server cluster would be kind-of fun...
18:42 Christian87 thats true ;)
18:43 JoeJulian storage porn: https://plus.google.com/u/0/photos/113457525690313682370/albums/6021614279363431345
18:44 Christian87 it's not available publicly
18:45 JoeJulian https://plus.google.com/photos/113457525690313682370/albums/6021614279363431345?authkey=CKHarMm2oOKm2gE
18:46 Christian87 not bad ;)
18:48 JoeJulian 4 of those trays per server, 4 servers per rack, that 15 rack module is being populated. Then there's the one in New Jersey with more coming up in Ohio, London and Singaport.
18:48 JoeJulian s/port/pore/
18:48 glusterbot What JoeJulian meant to say was: 4 of those trays per server, 4 servers per rack, that 15 rack module is being populated. Then there's the one in New Jersey with more coming up in Ohio, London and Singapore.
18:49 qdk joined #gluster
19:46 ron-slc joined #gluster
20:41 DV joined #gluster
20:49 cmtime JoeJulian do you like those JBODs?  Right now we run 24-bay supermicros with one 12-thread xeon.
20:50 JoeJulian cmtime: Mostly, though with the density, heat's a challenge.
20:51 cmtime cpu could be too
20:51 cmtime depending on what you are doing
20:52 diegows joined #gluster
20:52 JoeJulian It's an external SAS enclosure, so CPU is fine. That's all OCP gear. You can see two servers between 8 trays (two enclosures). There are three 2u servers side-by-side in the rack.
20:53 cmtime Ya  I went to check out the site that makes them.
20:54 cmtime What I mean is, depending on how you use the jbod, if you hang it off a server, 30-120 drives can be a lot to deal with.
20:54 cmtime Like in my case mdadm with 24 drives and gluster with a lot going on can take a 12-thread xeon to 100% easily
20:54 JoeJulian In its current configuration, this is what I'm getting: https://plus.google.com/u/0/+JoeJulian/posts/4oM1yGG3h88
20:55 JoeJulian That's mdadm raid 6
20:55 JoeJulian The compute nodes, of course, are not on the GlusterFS servers.
20:56 cmtime how many drives in the raid 6?
20:56 JoeJulian 15 per brick. 4 bricks.
20:57 cmtime cool
20:57 cmtime It looks a lot like something I am building
20:57 JoeJulian The CPU has had no issue with load at all. I have more problems with heat. Like when building (or rebuilding) an array, I can't turn up the rebuild rate or the drives start getting hot.
20:58 cmtime Ya that would be scary for me
20:58 JoeJulian I wish they had some sort of heat sink.
20:58 cmtime I have been through too many data centers with AC loss and fires
20:59 cmtime Not wanting to melt $40,000 in drives
20:59 JoeJulian https://www.youtube.com/watch?v=y66RUOmX9Pw
20:59 glusterbot Title: IO.Anywhere® Modular Data Center - YouTube (at www.youtube.com)
20:59 JoeJulian That's one of the modules we've populated.
20:59 JoeJulian It's an impressive facility.
21:00 cmtime I realized I think we are kinda near each other, I am in Kitsap
21:00 JoeJulian Oh, cool. Yeah, I work from home in Edmonds.
21:00 cmtime same I work from home
21:00 cmtime gear is in Montreal
21:00 cmtime used to be in the westin years ago
21:03 cmtime http://www.supermicro.com/products/system/2U/2027/SYS-2027TR-H72QRF.cfm
21:03 glusterbot Title: Supermicro | Products | SuperServers | 2U | 2027TR-H72QRF (at www.supermicro.com)
21:03 cmtime thats what I am doing for compute nodes
21:05 JoeJulian Cool. I'm not a huge fan of SM. I know I'm supposed to be, but I just can't bring myself to it. Seems overpriced and so proprietary that if anything goes wrong or needs upgrading, throw it away and buy a new one.
21:05 cmtime Umm yes/no
21:06 cmtime here are my bricks http://www.supermicro.com/products/chassis/4U/846/SC846BE16-R1K28.cfm
21:06 glusterbot Title: Supermicro | Products | Chassis | 4U | SC846BE16-R1K28B (at www.supermicro.com)
21:06 cmtime so as they get older and I need to upgrade backplanes, that's $600 to upgrade the case.
21:07 cmtime I mean ya the case is $1200 but we are looking at it as a 4-8 year investment
21:07 JoeJulian That's not as bad as last time I used SM.
21:08 cmtime And with that case we have 2 separate boot drives in the back for the os.  And in the front we have the 24 data drives that I do two 11-drive raid 5s with to build a raid 50.
21:08 Christian87 when a brick fails where can i see that it was detected (ubuntu 14.04)
21:10 cmtime Christian87 with "gluster volume status"  you should be able to see
21:11 cmtime Some checks for nagios are out on the web too to monitor it all the time.
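For spotting a dead brick from the CLI, the volume status views and the brick processes themselves are the two things to check; a sketch with a placeholder volume name:

    gluster volume status myvol           # per-brick port, PID and state
    gluster volume status myvol detail    # adds disk usage and inode counts
    ps aux | grep glusterfsd              # one brick daemon per brick should be running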
21:12 cmtime JoeJulian, the thing I was looking at last week was to just go full infiniband in the colo and do tcp/ip over IB with two redundant ib to tcp gateways
21:13 JoeJulian Oooh, nice.
21:14 Christian87 mh there is nothing that shows me that a brick failed
21:14 Christian87 https://gist.github.com/w0bble/9e69364abcb5d6e2892e
21:14 glusterbot Title: gist:9e69364abcb5d6e2892e (at gist.github.com)
21:14 cmtime I think I have that all solved now but man it was complex to find all I needed to understand to move from a 100-vlan network to IB with 100 p_keys, with little docs on how it all works.
21:14 Christian87 it just disappears from the status page
21:15 JoeJulian Christian87: What version?
21:18 cmtime JoeJulian, do you know when they plan to fix quota in gluster?  I want to use it for billing so I do not have to spend months doing du -sh *
21:19 JoeJulian cmtime: When the bugs are filed, usually.
21:19 JoeJulian Do you have a bugzilla entry that's stale?
21:21 cmtime no, I saw one open but it has an ongoing problem
21:21 cmtime when I turn it on all the nodes crash and stop talking to each other
21:21 JoeJulian Oh, I've not heard of that at all.
21:21 JoeJulian This is the part of the conversation where I chastise you for not filing a bug report.... ;)
21:22 JoeJulian file a bug
21:22 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
21:23 cmtime lol I promise a bug was filed by someone, I saw it lol
21:25 JoeJulian There was something that may have been similar with someone that turned off quota then immediately turned it back on in 3.5. https://botbot.me/freenode/gluster/2014-07-24/?msg=18573494&page=4
21:25 glusterbot Title: Logs for #gluster | BotBot.me [o__o] (at botbot.me)
21:25 cmtime But ya it sucks because you turn quota on.  On some it looks okay for a few minutes, then boom, you have to turn it off on all nodes before they will talk with each other again.  It causes glusterfs to just hang.  I can only reproduce it with my production glusters and not in testing.
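For the record, the quota workflow cmtime wants for billing (instead of running du -sh everywhere) is only a few commands once the crash is sorted out; a sketch with placeholder volume and directory names:

    gluster volume quota myvol enable
    gluster volume quota myvol limit-usage /customers/acme 500GB
    gluster volume quota myvol list       # shows limit and current usage per directory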
21:25 Christian87 @JoeJulian glusterfs 3.4.4
21:25 cmtime I will have to try to dig it up if not I will file a post next week
21:26 JoeJulian Christian87: Ah, ok. That feature has been added in 3.5. 3.4 just expects you to use some other monitoring software.
21:30 Christian87 but 3.5 isn't stable yet?
22:09 diegows joined #gluster
22:25 firemanxbr joined #gluster
22:39 plarsen joined #gluster
22:41 Lyfe joined #gluster
23:11 Mick271 joined #gluster
23:11 Mick271 Hello folks
23:11 Christian87 left #gluster
23:12 Mick271 is there a known issue with debian 7 + the gluster nfs part eating a lot of ram
23:12 Mick271 seems like I did an aptitude upgrade last week or so and since then my two servers are crashing because of some memory issue
23:13 JoeJulian what version?
23:13 Mick271 and looking at htop shows me this process at 50% of mem
23:13 Mick271 3.4
23:14 JoeJulian Not aware of any memory leak bugs with 3.4. 3.4.5 was just released a few days ago, though. I don't know if anyone's built it for debian.
23:15 Mick271 seems like I am running 3.4.0-2
23:15 Mick271 is that a big step to migrate to 3.5 ?
23:15 JoeJulian Oh, then yes. There are known memory leaks.
23:15 Mick271 I remember seeing it is easier than 3.3 to 3.4
23:15 JoeJulian Should be pretty simple.
23:15 Mick271 K
23:15 Mick271 I might go and do that then
23:16 Mick271 hmm my apt repo seems to specify 3.4.0 explicitly, wonder where I found this
23:18 msciciel1 joined #gluster
23:19 m0zes joined #gluster
23:19 hflai_ joined #gluster
23:20 wgao joined #gluster
23:22 Mick271 one thing, I had to change some stuff in my fstab recently
23:22 Mick271 I had to remove the use of v3 for nfs
23:22 Mick271 is this something that's not supposed to work on its own?
23:23 Mick271 again, besides the aptitude upgrade I did not modify any conf
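For context on why removing vers=3 is surprising: Gluster's built-in NFS server only speaks NFSv3, and the commonly documented fstab entry pins the version and mount protocol explicitly; a sketch with hypothetical host, volume and mount point:

    server1:/myvol  /mnt/myvol  nfs  defaults,vers=3,mountproto=tcp,nolock  0 0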
23:30 JoeJulian I'm not sure if we have a debian maintainer currently.
23:30 JoeJulian check download.gluster.org
23:32 Mick271 ok
23:32 Mick271 I found 3.4.4
23:32 Mick271 but I have to change my repo for every new version, first time I've encountered this way of doing it
23:34 Mick271 well there is a /latest/ that would probably fix this for me
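Pointing the sources.list entry at the LATEST symlink on download.gluster.org is one way to stop editing the repo on every release; the exact path below is illustrative and should be checked against the layout actually published there (this sketch assumes Debian wheezy):

    deb http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/wheezy/apt wheezy main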
23:51 diegows joined #gluster
