
IRC log for #gluster, 2014-08-25


All times shown according to UTC.

Time Nick Message
00:28 Alex____1 joined #gluster
00:34 gildub joined #gluster
00:51 glusterbot New news from newglusterbugs: [Bug 1083963] Dist-geo-rep : after renames on master, there are more number of files on slave than master. <https://bugzilla.redhat.com/show_bug.cgi?id=1083963>
01:04 harish joined #gluster
01:07 mbukatov joined #gluster
01:17 glusterbot New news from resolvedglusterbugs: [Bug 768324] memory corruption in client process.[Release:3.3.0qa15] <https://bugzilla.redhat.com/show_bug.cgi?id=768324> || [Bug 805802] Replace-brick operations just exits with operation failed. <https://bugzilla.redhat.com/show_bug.cgi?id=805802> || [Bug 1022593] Dist-geo-rep : When node goes down and come back in master cluster, that particular session will be defunct. <https://bugzilla.redhat.com/show_bug.
01:18 vimal joined #gluster
01:21 glusterbot New news from newglusterbugs: [Bug 1101111] [RFE] Add regression tests for the component geo-replication <https://bugzilla.redhat.com/show_bug.cgi?id=1101111> || [Bug 1121072] [Dist-geo-rep] : In a cascaded setup, after hardlink sync, slave level 2 volume has sticky bit files found on mount-point. <https://bugzilla.redhat.com/show_bug.cgi?id=1121072> || [Bug 1075417] Spelling mistakes and typos in the glusterfs source <https://bugzilla.redhat.co
01:25 Alex____1 joined #gluster
01:27 aulait joined #gluster
01:30 MacWinner joined #gluster
01:31 MacWinner any large public examples of companies using gluster?  particularly customers who rely on their gluster storage for their customer-facing business
01:31 MacWinner like a box.net or similar?
01:32 m0zes I know pandora was, and probably still are...
01:33 MacWinner ahh, cool!
01:35 m0zes http://www.linux-magazine.com/Online/News/Pandora-Deploys-Gluster
01:36 sputnik13 joined #gluster
01:49 Alex I just want to double check something - is there a sensible way I can change the distribution method across bricks? I ended up in a situation where I had 4 bricks, each with 2.8TB of data written to them, but because the bricks were not equal sized, when one was full, I got an error writing to a folder stored on that brick
01:49 Alex What I'd like is if the data was distributed equally across the bricks by %age of total usage, rather than absolute usage, IYSWIM
01:56 topshare joined #gluster
01:56 haomaiwa_ joined #gluster
01:58 harish joined #gluster
02:04 sputnik13 joined #gluster
02:09 neofob left #gluster
02:12 haomai___ joined #gluster
02:37 cristov joined #gluster
02:41 plarsen joined #gluster
02:57 topshare joined #gluster
03:02 gildub joined #gluster
03:02 bala joined #gluster
03:04 kshlm joined #gluster
03:16 sputnik13 joined #gluster
03:34 sputnik13 joined #gluster
03:40 ppai joined #gluster
03:47 shubhendu_ joined #gluster
03:54 itisravi joined #gluster
04:06 prasanth_ joined #gluster
04:16 gildub joined #gluster
04:24 saurabh joined #gluster
04:27 raghu` joined #gluster
04:27 raghu` joined #gluster
04:34 Rafi_kc joined #gluster
04:35 RameshN joined #gluster
04:36 anoopcs joined #gluster
04:36 atinmu joined #gluster
04:44 jiku joined #gluster
04:58 ramteid joined #gluster
05:02 kdhananjay joined #gluster
05:13 rastar joined #gluster
05:13 ndarshan joined #gluster
05:13 shubhendu joined #gluster
05:27 spandit joined #gluster
05:30 jiku joined #gluster
05:33 meghanam joined #gluster
05:33 meghanam_ joined #gluster
05:34 rastar joined #gluster
05:37 karnan joined #gluster
05:42 topshare joined #gluster
05:48 ws2k33 joined #gluster
05:57 kanagaraj joined #gluster
05:59 bala joined #gluster
05:59 nishanth joined #gluster
06:00 topshare joined #gluster
06:02 nshaikh joined #gluster
06:03 rtalur_ joined #gluster
06:08 haomaiwang joined #gluster
06:09 lalatenduM joined #gluster
06:16 kumar joined #gluster
06:17 haomai___ joined #gluster
06:21 ctria joined #gluster
06:24 XpineX joined #gluster
06:26 atinmu joined #gluster
06:34 kshlm joined #gluster
06:37 saurabh joined #gluster
06:39 bala joined #gluster
06:39 hagarth joined #gluster
06:45 atalur joined #gluster
06:48 atinmu joined #gluster
06:56 rtalur_ joined #gluster
06:59 bala joined #gluster
07:00 prasanth_ joined #gluster
07:04 kshlm joined #gluster
07:04 hagarth1 joined #gluster
07:06 deepakcs joined #gluster
07:08 ricky-ti1 joined #gluster
07:10 aravindavk joined #gluster
07:11 andreask joined #gluster
07:30 fsimonce joined #gluster
07:32 andreask joined #gluster
07:33 andreask joined #gluster
07:34 jiffin joined #gluster
07:35 bala joined #gluster
07:49 bala joined #gluster
07:53 glusterbot New news from newglusterbugs: [Bug 1130888] Renaming file while rebalance is in progress causes data loss <https://bugzilla.redhat.com/show_bug.cgi?id=1130888>
07:56 social joined #gluster
07:57 liquidat joined #gluster
08:05 Thilam joined #gluster
08:10 atinmu joined #gluster
08:11 aravindavk joined #gluster
08:12 hagarth joined #gluster
08:29 dastar joined #gluster
08:41 spandit joined #gluster
08:49 karnan joined #gluster
08:53 glusterbot New news from newglusterbugs: [Bug 1133464] xml output needed for geo-rep CLI commands <https://bugzilla.redhat.com/show_bug.cgi?id=1133464>
09:13 kshlm joined #gluster
09:18 harish_ joined #gluster
09:27 karnan joined #gluster
09:31 deepakcs joined #gluster
09:36 rwheeler joined #gluster
09:47 gildub joined #gluster
09:48 aulait joined #gluster
10:06 ppai joined #gluster
10:18 ekuric joined #gluster
10:29 atinmu joined #gluster
10:32 edward1 joined #gluster
10:46 nshaikh joined #gluster
10:54 mhoungbo joined #gluster
10:59 ramons joined #gluster
11:00 kkeithley1 joined #gluster
11:02 ramons joined #gluster
11:03 ira joined #gluster
11:05 tdasilva joined #gluster
11:19 glusterbot New news from resolvedglusterbugs: [Bug 1132116] cli: -fsanitize heap-use-after-free error <https://bugzilla.redhat.com/show_bug.cgi?id=1132116>
11:32 dusmant joined #gluster
11:48 atinmu joined #gluster
11:50 ppai joined #gluster
11:51 HoloIRCUser joined #gluster
11:58 nishanth joined #gluster
11:59 LebedevRI joined #gluster
12:04 ramon_dl joined #gluster
12:05 B21956 joined #gluster
12:10 ramon_dl_ joined #gluster
12:11 cerebuss joined #gluster
12:11 zerick joined #gluster
12:12 bala joined #gluster
12:15 cerebuss beginner with gluster here, trying to do my first setup. if i have 3 servers (to grow later) and want a replica count of 2, i take it I need to create at least 2 bricks on each server? how would i know which bricks are the replicas of which so i can move a brick if it has its replica on the same server?
12:20 rwheeler joined #gluster
12:22 andreask joined #gluster
12:23 andreask joined #gluster
12:24 plarsen joined #gluster
12:24 HoloIRCUser cerebuss: the order you specify the bricks in the create volume command determines the replica pairs. The cli will warn if it considers the bricks to be on the same node, I believe.
12:26 cerebuss ahh ok, thanks. can i later see which bricks are replicas as well?
12:26 calum_ joined #gluster
12:33 andreask joined #gluster
12:35 LHinson joined #gluster
12:36 chirino joined #gluster
12:38 LHinson1 joined #gluster
12:40 HoloIRCUser1 joined #gluster
12:41 Jamoflaw joined #gluster
12:44 HoloIRCUser1 Gluster volume info volumename I think
12:44 HoloIRCUser1 Should list the bricks in order
12:46 Jamoflaw- And from the order you can determine the distribute pairs
12:48 theron joined #gluster
12:48 hybrid512 joined #gluster
12:49 dusmant joined #gluster
12:49 cerebuss thanks! easy enough ... saw the list already but didn't know order determined pairs
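
A minimal sketch of what was just described, with hypothetical server names and brick paths: with "replica 2" the bricks are paired in the order they are listed, and "gluster volume info" prints them back in that same order.

    # Bricks 1+2 form one replica pair, bricks 3+4 the next; listing two
    # bricks of the same server side by side would put a pair on one node,
    # which the CLI warns about.
    gluster volume create myvol replica 2 \
        server1:/bricks/b1 server2:/bricks/b1 \
        server2:/bricks/b2 server3:/bricks/b2

    # The brick list follows the creation order, so the replica pairs can
    # be read off it later: Brick1+Brick2, Brick3+Brick4, ...
    gluster volume info myvol
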
12:51 Jamoflaw haven't quite worked it out with stripe yet though as I haven't used that yet
12:52 plarsen joined #gluster
12:55 ricky-ticky1 joined #gluster
13:01 theron joined #gluster
13:05 aravindavk joined #gluster
13:05 hypnotortoise joined #gluster
13:06 sage joined #gluster
13:11 hypnotortoise can you direct-mount a subdirectory on a gfs volume via glusterfs?
13:17 vimal joined #gluster
13:17 recidive joined #gluster
13:21 HoloIRCUser1 joined #gluster
13:23 tdasilva joined #gluster
13:24 sniper joined #gluster
13:24 ekuric joined #gluster
13:25 Ark joined #gluster
13:26 BrandEmbassyLtd Hello gluster, I have one question - can I somehow change replica count in cluster? Ie. I have 4 nodes with replicated volume (replica 4) and I would like to add another node and change replica to 5, so every data on volume will be replicated to new node too. is this possible?
13:27 BrandEmbassyLtd I would like to ensure that data is available when only one node of cluster is available
13:28 mojibake joined #gluster
13:29 bala joined #gluster
13:29 theron joined #gluster
13:36 Ark_ joined #gluster
13:40 cerebuss BrandEmbassyLtd, you can set replica-count at the same time as you add the brick "volume add-brick <VOLNAME> [<stripe|replica> <COUNT>] <NEW-BRICK> ... [force] "
13:41 BrandEmbassyLtd cerebuss, thanks, so this will rewrite older setting of replica count?
13:42 cerebuss yes, it worked well for me from 1 to 2 at least, but it's my first day messing with glusterfs so don't trust me too much ;)
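
A sketch of the add-brick form quoted above, going from replica 2 to replica 3 (volume and brick names are made up); the same shape applies to BrandEmbassyLtd's 4-to-5 case:

    # The new brick becomes the extra copy for the existing replica set.
    gluster volume add-brick myvol replica 3 server3:/bricks/b1

    # Existing data only lands on the new brick once it is healed, so
    # kicking off a full self-heal afterwards is the usual follow-up.
    gluster volume heal myvol full
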
13:42 ndevos hypnotortoise: no, that is not possible yet, you can only mount the complete volume
13:46 ramon_dl joined #gluster
13:47 ramon_dl left #gluster
13:48 kombucha joined #gluster
13:48 kombucha morning everyone
13:50 ramon_dl joined #gluster
13:50 kombucha when I do gluster peer status, 1 of the peers has an IP address, not a host name
13:50 kombucha how do I change this?
13:51 kombucha It doesn't work by IP, since that can change based on the infra it's running on
13:51 coredump joined #gluster
13:53 ramon_dl if gluster peer status shows ip address of a node try to repeat peer probe to it from another node
13:55 ramon_dl remember your dns or /etc/hosts on each node must resolve ipnode<-->name
13:55 glusterbot ramon_dl: ipnode<'s karma is now -1
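
A sketch of the re-probe ramon_dl describes, with hypothetical node names and addresses; the point is that every node must resolve the others by name, and probing back from the second node replaces the stored IP with the hostname:

    # /etc/hosts (or DNS) on every node:
    # 192.0.2.11   gluster1
    # 192.0.2.12   gluster2

    gluster1# gluster peer probe gluster2   # gluster2 may still know gluster1 only by IP
    gluster2# gluster peer probe gluster1   # re-probing by name fixes the stored hostname
    gluster2# gluster peer status
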
13:56 kombucha I just tried to do that, now I have 2 peers instead of 1. The original 1 has its IP address for the hostname, but the 2nd/new one has all zeros for the UUID (00000000-0000-0000 etc)
13:57 kombucha Also gluster peer status didn't come back, I had to break, and the new peer has status "State: Establishing Connection (Connected)"
13:58 ramon_dl Do you use gluster peer probe node_name?
13:59 LHinson joined #gluster
14:00 kombucha yes, it looks like there is a glitch/bug in there with the hostname IP address, bc it seemed to think it had 2 peers with the same name
14:01 kombucha I have detached them both by issuing peer detach in succession
14:01 kombucha and after doing that, I can correctly add the peer back
14:01 ramon_dl great!
14:02 kombucha aaand my transport endpoint is not connected error has gone away
14:02 kombucha I'm on 3.2, btw, haven't had the time (or courage, lol) to upgrade yet
14:02 ramon_dl Do you have dns or /etc/hosts name resolution?
14:04 julim joined #gluster
14:06 bennyturns joined #gluster
14:08 kombucha yes, hosts entries are there
14:08 kombucha here's what I'm seeing: http://privatepaste.com/b60ac32af0
14:08 glusterbot Title: privatepaste.com :: Paste ID b60ac32af0 (at privatepaste.com)
14:10 kombucha weird, now peer 1 has changed the hostname it shows for peer 2 into peer 2's IP address. I'm fairly sure it had the host name when I just did peer status a minute ago
14:12 hawksfan joined #gluster
14:13 kombucha the peers are definitely talking, I can start/stop the volume from either one
14:13 ramon_dl It seems you can't access the mount point, but it seems correctly mounted
14:13 kombucha however it's giving "Input/output error" which it wasn't doing before
14:14 hawksfan i have 20 servers with 5 TB of data on each
14:14 anoopcs joined #gluster
14:14 hawksfan is there a way to convert them to gluster without re-copying the data onto the servers?
14:15 kombucha the problem before was simple, the peers were not talking bc the IP hostname mapping was wrong
14:15 kombucha and gluster somehow let me add the same hostname with a different IP.  That was easy enough to fix by removing them both and adding the correct one back.
14:16 kombucha So then they were talking to each other again
14:16 ramon_dl Do you have data in "fileshare" volume?
14:19 HoloIRCUser1 joined #gluster
14:19 ramon_dl hawksfan: I don't know any way. Files inside a gluster volume need to be distributed with the elastic hash algorithm and must have appropriate extended attributes.
14:19 cerebuss joined #gluster
14:20 ramon_dl kombucha: if no data, simple way is destroy volume and re-build it
14:21 kombucha I have the data backed up to my home directory, so I could destroy and repopulate, I think
14:22 kombucha Is there a good step by step I could review to make sure I am doing everything right?
14:22 haomaiwang joined #gluster
14:22 hawksfan @ramon_dl: thanks for the info - good to know for sure, rather than chasing after it
14:23 theron_ joined #gluster
14:24 wushudoin joined #gluster
14:24 ramon_dl1 joined #gluster
14:24 kombucha ah, it still has the UUID of the old peer!!
14:24 kombucha but it was showing it by hostname, not IP address
14:24 kombucha so I didn't realize it was doing that, until I checked the UUID
14:25 kombucha How do I *completely* remove that peer so that gluster has no memory of it?
14:25 kombucha I want to re-add it as a "new" peer, with the same hostname as the old peer, but a different IP address
14:25 ramon_dl1 left #gluster
14:26 kombucha Also, it oddly says it's connected to the peer, so maybe it is the new peer (since the old peer is powered down)
14:27 kombucha So the UUID was misleading, apparently it can be the same
14:28 ramon_dl joined #gluster
14:28 glusterbot` joined #gluster
14:29 eclectic_ joined #gluster
14:29 chirino joined #gluster
14:31 xleo joined #gluster
14:31 neoice_ joined #gluster
14:31 abyss_ joined #gluster
14:31 cicero_ joined #gluster
14:32 JustinCl1ft joined #gluster
14:32 bfoster_ joined #gluster
14:32 nthomas joined #gluster
14:32 l0uis_ joined #gluster
14:33 tg2 joined #gluster
14:33 verdurin joined #gluster
14:33 fsimonce joined #gluster
14:34 T0aD joined #gluster
14:36 R0ok_ joined #gluster
14:37 gmcwhistler joined #gluster
14:46 bet_ joined #gluster
14:48 jobewan joined #gluster
14:49 wgao_ joined #gluster
14:50 gmcwhist_ joined #gluster
14:50 SpComb joined #gluster
14:58 hagarth joined #gluster
15:01 _Bryan_ joined #gluster
15:04 chirino joined #gluster
15:06 hypnotortoise ndevos: thanks. can you point me to the feature request (if there is one)?
15:08 ndevos hypnotortoise: that would be bug 892808, I think
15:08 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=892808 low, low, ---, aavati, NEW , [FEAT] Bring subdirectory mount option with native client
15:09 lmickh joined #gluster
15:15 ramon_dl kombucha: I'm busy now, sorry!  take a look at /var/lib/glusterd/peers. stop gluster (both nodes, glustrd, glusterfs, glusterfsd) , modify node  name, restart
15:16 kombucha thanks ramon_dl I appreciate it
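
A sketch of the manual fix just outlined, with a made-up UUID; field names can vary between gluster versions, so treat this as a guide rather than exact file contents:

    # Stop the management daemon and any brick/mount processes on both nodes.
    service glusterd stop
    pkill glusterfsd; pkill glusterfs

    # One file per peer under /var/lib/glusterd/peers, named by UUID:
    cat /var/lib/glusterd/peers/7e3da2c3-0000-0000-0000-000000000000
    #   uuid=7e3da2c3-0000-0000-0000-000000000000
    #   state=3
    #   hostname1=192.0.2.12    <- change this to the peer's hostname

    # Edit the hostname field, then restart glusterd on both nodes.
    service glusterd start
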
15:18 LHinson joined #gluster
15:23 daMaestro joined #gluster
15:24 glusterbot New news from newglusterbugs: [Bug 892808] [FEAT] Bring subdirectory mount option with native client <https://bugzilla.redhat.com/show_bug.cgi?id=892808>
15:25 ramon_dl joined #gluster
15:33 calum_ joined #gluster
15:35 bennyturns joined #gluster
15:39 Frank77 How exactly are write operations handled? What happens in a replicated volume when one of the two servers is much faster than the other? 1st: 15MB/s and the 2nd: 240 MB/s.
15:40 sputnik13 joined #gluster
15:51 cmtime JoeJulian, when you are around I need to seek some help from you.
16:07 dtrainor joined #gluster
16:10 PeterA joined #gluster
16:12 PeterA any clue about why this could happened?
16:12 PeterA [marker.c:2482:marker_setattr_cbk] 0-sas03-marker: Operation not permitted occurred during setattr of /TrafficPrDataFc01//TrafficCost/muo/costdtl_muo_20140825054411_011.bcp
16:13 PeterA https://bugzilla.redhat.com/show_bug.cgi?id=1037511
16:13 glusterbot Bug 1037511: high, unspecified, ---, vbellur, NEW , Operation not permitted occurred during setattr of <nul>
16:15 cwray joined #gluster
16:21 recidive joined #gluster
16:23 aravindavk joined #gluster
16:24 ir8 Anyone around?
16:34 zerick joined #gluster
16:38 jbrooks left #gluster
16:44 aravindavk joined #gluster
16:45 Ark joined #gluster
16:47 hagarth joined #gluster
16:54 jbrooks joined #gluster
17:03 PeterA just started getting this error today
17:03 PeterA .2014-08-25 16:55:17.439143] E [server-rpc-fops.c:796:server_getxattr_cbk] 0-sas03-server: 442815: GETXATTR <gfid:9e6fa8e9-d5ae-4242-9c20-5fbf57f7778c> (9e6fa8e9-d5ae-4242-9c20-5fbf57f7778c) ((null)) ==> (Permission denied)
17:03 PeterA how can i find out the gfid?
17:04 PeterA [marker.c:327:marker_getxattr_cbk] 0-sas03-marker: dict is null
17:05 sputnik13 joined #gluster
17:10 semiosis hello
17:10 glusterbot semiosis: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
17:10 semiosis ir8: ^^^
17:11 recidive joined #gluster
17:18 PeterA hi semiosis
17:18 Slasheri joined #gluster
17:18 semiosis aloha
17:18 PeterA do u happen to have seen this error before??
17:19 semiosis no
17:19 PeterA [marker.c:327:marker_getxattr_cbk] 0-sas03-marker: dict is null
17:19 PeterA i wonder how can i start looking into the cause
17:19 PeterA .2014-08-25 16:55:17.439143] E [server-rpc-fops.c:796:server_getxattr_cbk] 0-sas03-server: 442815: GETXATTR <gfid:9e6fa8e9-d5ae-4242-9c20-5fbf57f7778c> (9e6fa8e9-d5ae-4242-9c20-5fbf57f7778c) ((null)) ==> (Permission denied)
17:19 PeterA just started happen today again...
17:19 mhoungbo joined #gluster
17:20 PeterA how do i identify the gfid of these
17:21 semiosis perhaps with the ,,(gfid resolver)
17:21 glusterbot https://gist.github.com/4392640
17:21 PeterA cool let me try
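
The gist glusterbot links is a script, but the idea behind it can be sketched by hand; the brick root below is hypothetical and the gfid is the one from PeterA's log line:

    # Regular files on a brick are hard-linked under .glusterfs/aa/bb/<gfid>,
    # where aa/bb are the first two byte pairs of the gfid. Find that link's
    # inode, then search the brick for the real path sharing it.
    # (Directories use a symlink there instead, so this covers files only.)
    BRICK=/bricks/b1
    GFID=9e6fa8e9-d5ae-4242-9c20-5fbf57f7778c
    INODE=$(stat -c %i "$BRICK/.glusterfs/9e/6f/$GFID")
    find "$BRICK" -inum "$INODE" -not -path "*/.glusterfs/*"
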
17:43 21WAA3GC7 joined #gluster
17:43 DV joined #gluster
17:44 doo joined #gluster
17:44 _dist joined #gluster
18:11 kmai007 joined #gluster
18:11 kmai007 JoeJulian: are you here?
18:12 PeterA it seems like we getting this error [marker.c:327:marker_getxattr_cbk] 0-sas03-marker: dict is null
18:12 PeterA whenever we have a mod 700 or 600 file modified
18:12 PeterA and we getting the setattr error when we have directories or file with g+s
18:13 PeterA is that a known bug on 3.5.2 ubuntu??
18:15 theron joined #gluster
18:17 theron_ joined #gluster
18:17 foster joined #gluster
18:18 recidive joined #gluster
18:24 _dist PeterA: did you by any chance live add or remove a brick? I had that same problem, only after doing that
18:29 ira joined #gluster
18:32 chirino joined #gluster
18:36 recidive joined #gluster
18:38 PeterA no i did not
18:39 PeterA _dist: did the error go away?
18:39 kmai007 i was told if you run a find . on that volume mount from a client it should fix your issue
18:40 kmai007 it was an old email thread, you could possibly google it?, but that was for 3.4.1, i'm not sure about your version
18:40 PeterA from a glusterfs client?
18:40 kmai007 yessir
18:40 PeterA i m on ubuntu 3.5.2
18:41 kmai007 i haven't used 3.5.2, but is the error giving you problems, or is it just logging a lot of noise?
18:41 PeterA logging noise and wonder if it creates potential problems
18:42 kmai007 thats a tough one, i'm not on that release yet, running the find . won't ruin you, you can give it a try
18:44 _dist PeterA: no until I rebuilt the volume from scratch
18:44 PeterA kmai007: i am running the find now
18:44 PeterA _dist: wow that sucks....
18:46 PeterA the catch is when i run the getfattr -d -m . -e hex against the dirs, it shows nothing....
18:46 PeterA i would expect to see the trusted.id
18:47 kmai007 the find . i think will trigger the seal-heal
18:47 kmai007 self*, and it would maybe rebuild those gettars
18:49 PeterA ah….i will check after the find is finished
18:50 PeterA thank you very much on advice :)
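
A sketch of the crawl kmai007 suggests, run from a FUSE client mount (the mount point and brick path are hypothetical); stat-ing every entry forces a lookup on each file, which is what gives self-heal a chance to repair missing copies or xattrs:

    # Walk the whole volume from a client and stat everything.
    find /mnt/sas03 -noleaf -print0 | xargs -0 stat > /dev/null

    # Afterwards, xattrs can be re-checked directly on the brick:
    getfattr -d -m . -e hex /bricks/b1/path/to/dir
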
18:50 kmai007 no prob. good luck, always remember
18:51 kmai007 you can offer your problem to the mailing list, and someone more intelligent than me can respond back
18:51 PeterA sure!
18:52 lyang0 joined #gluster
18:53 nico_ joined #gluster
18:53 nico_ hello to all
18:54 kmai007 whats your question nickmoeck
18:54 kmai007 nico_
18:54 nico_ anybody has experience with performance over gluster with many small files
18:54 kmai007 funny i used to say hi, and i got a scolding from glusterbot:
18:55 kmai007 everybody has had some experience with small files, large files, # of files on glusterfs
18:55 kmai007 in production
18:55 PeterA nico_: i got hit by rpc lookups with a lot of small files in a dir
18:55 kmai007 what is "a lot"
18:55 kmai007 200k ?
18:56 kmai007 i had 1.4 million and that was not a good experience
18:56 kombucha joined #gluster
18:57 nico_ the installation is in one client who work with video animation so they have many small files
18:58 nico_ in they proyects.  So copy over glusterfilesystem mounted with samba its really slow
18:58 kmai007 why use samba? does it have a windows componenet?
18:58 nico_ the network is 10gbs, copying a single file or a DVD-sized file performs reasonably
18:59 semiosis nico_: are you using the ,,(samba vfs) plugin?
18:59 glusterbot nico_: I do not know about 'samba vfs', but I do know about these similar topics: 'sambavfs'
18:59 semiosis ,,(sambavfs)
18:59 glusterbot http://lalatendumohanty.wordpress.com/2014/02/11/using-glusterfs-with-samba-and-samba-vfs-plugin-for-glusterfs-on-fedora-20/
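
For reference, the plugin from the link above is configured per share in smb.conf; a minimal sketch with a made-up volume name, untested here, so check the vfs_glusterfs documentation for the exact option set on your version:

    [projects]
        # Samba talks to the volume over libgfapi instead of a FUSE mount.
        vfs objects = glusterfs
        glusterfs:volume = myvol
        glusterfs:logfile = /var/log/samba/glusterfs-myvol.log
        # path is relative to the root of the gluster volume
        path = /
        read only = no
        kernel share modes = no
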
18:59 nico_ yes, the backend is windows, the workstations mount the network share
19:00 kombucha_ joined #gluster
19:01 kmai007 semiosis: i can use the samba vfs plugin without CTDB?
19:01 semiosis kmai007: i dont know
19:01 kmai007 ok...just wondering
19:01 nico_ semiosis:  no not using samba vfs plugin, Just FUSE mount
19:01 theron joined #gluster
19:02 semiosis nico_: well you might want to try that, especially if you're seeing high CPU usage
19:02 semiosis on the samba server
19:02 kombucha_ joined #gluster
19:03 nico_ the cpu is normal, not heavy load, but the transfer speed is really slow with a project folder with many files inside
19:03 semiosis nico_: you might also want to disable atimes on your bricks (mount with noatime,nodiratime).  if there's a way to disable atimes in samba too, do that
19:03 nico_ the same folder as a tar.gz copies great
19:04 nico_ semiosis:  Ok, thanks for the comments, i will take a look at these topics
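
A sketch of the brick mount options semiosis mentions, as an /etc/fstab line (device and mount point are hypothetical):

    # Skip access-time updates on the brick filesystem; inode64 is a common
    # recommendation for large XFS bricks.
    /dev/mapper/vg0-brick1  /bricks/b1  xfs  noatime,nodiratime,inode64  0 0

    # Or remount a live brick without rebooting:
    mount -o remount,noatime,nodiratime /bricks/b1
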
19:08 nico_ semiosis: Do you think it's worth giving the performance.cache tuning options a try also??
19:09 semiosis nico_: maybe.  i guess it's worth a try
19:09 nico_ or any change to the default configuration to optimize heavy workloads with tiny files
19:13 warci joined #gluster
19:13 semiosis doubt it
19:20 bene2 joined #gluster
19:21 warci hello all, i'm seeing %util when running iostat on my gluster server always close to 100%... what did i do wrong?
19:22 warci the server is connected to a big san, so the disk speed should be good, i'm guessing i made a wrong decision in the config somewhere?
19:30 _dist warci: a little more info on your setup?
19:30 MacWinner joined #gluster
19:30 foster joined #gluster
19:36 kmai007 does it always run at 100%? or are your clients doing work
19:36 kmai007 against the volume
19:37 kmai007 are you doing replication on the backend SAN,
19:38 kmai007 like _dist said, more detail on your setup is helpful.
19:39 kmai007 # of storge servers,
19:39 kmai007 version of gluster
19:39 kmai007 all that kind of info
19:42 semiosis glusterfs over SAN is unusual, possibly misguided.
19:45 longshot902 joined #gluster
19:49 B21956 joined #gluster
19:53 _dist semiosis: that's a surprising statement :)
19:53 zerick joined #gluster
19:58 * semiosis full of surprises
20:01 bene joined #gluster
20:02 bene joined #gluster
20:06 LHinson joined #gluster
20:08 warci well, i'm running gluster 3.4.2-1
20:09 recidive joined #gluster
20:09 warci the server is a vmware host, and my storage is attached as a raw lun (25 TB xfs volume)
20:09 Ark joined #gluster
20:09 warci the machine is running fine, no cpu or memory issues
20:10 warci the thing is, i'm not an expert at interpreting these io statistics, but everything looks quite ok, except that %util thing
20:11 warci i'm not doing any replication or anything, just a simple volume
20:12 warci right now i'm seeing 100% util, but 9% iowait
20:12 semiosis what kind of workload?
20:13 warci it's kind of varied, mostly small to medium size files
20:13 warci we don't have a lot of clients connected though
20:13 warci only 20-something actually doing very light work
20:14 semiosis warci: why not just use NFS?
20:14 _dist warci: what is the XFS stored on?
20:14 warci well, i liked the gluster concept so we can easily replicate to drp and add bricks on our different storage systems
20:14 warci the XFS is stored on an IBM XIV G3
20:14 warci it's quite powerful
20:15 _dist warci: I meant the disk setup, hw raid, etc # of disks and layout. But 20 VMs isn't too much workload (depending of course)
20:16 warci also the worm & read only stuff is handy for us, so i prefer to keep gluster
20:16 warci mmmm disk layout is tricky, it's a black box
20:16 semiosis warci: the XIV can't do all that?
20:16 warci we need a nas solution
20:16 warci xiv is only san
20:16 semiosis tbh most people use glusterfs on commodity hardware, so it's kinda hard to figure whats going on in your setup
20:17 warci yeah, i know it's not the ideal setup
20:17 _dist warci: also I'm pretty sure 3.4.2-1 will end up having a couple of issues hosting vms such as false heal info and potential issues when dynamically adding/removing bricks
20:17 warci but the disk system is very powerful, so i kinda hoped to get away with running only one node
20:17 warci i already noticed those issues :)
20:18 warci i'll upgrade to 3.5 asap, but first i need to know if we can keep using gluster for our infrastructure
20:18 semiosis why would you even need to add/remove glusterfs bricks?  can't the XIV grow/shrink a volume?
20:18 semiosis i'd imagine that's a basic feature of any such array
20:19 semiosis btw, do you pronounce XIV like ziv or like fourteen?
20:19 warci yeah, but we have several disk systems, so it's nice to be able to bundle them
20:19 warci we have 2 xiv's and a big dell thing
20:19 warci it's more like exaaiveee
20:21 warci it's basically a big gluster box by ibm with a nice management gui ripped off from apple :)
20:21 warci i'm trying to replace our ancient netapp with gluster
20:21 warci and feature-wise it's doing an excellent job
20:25 semiosis warci: so, back to your perf issue... you could strace the process using all the CPU.  that should give some indication whats going on
20:26 warci well, it's not really a cpu issue, that's at 2%
20:26 semiosis [16:12] <warci> right now i'm seeing 100% util, but 9% iowait
20:26 _dist it sounds like your disks are at 100% util right?
20:26 semiosis then what is your issue?
20:27 _dist semiosis: I think he means %util in iostat like
20:27 warci yeah
20:27 semiosis ah
20:27 warci yeah, disk utilization
20:27 semiosis well, who knows what that means
20:27 warci i notice when i'm at 100% clients begin to experience issues
20:27 semiosis how would it even know the limits?  so many layers of abstraction
20:28 _dist that's why I was asking about the raid setup, warci: how are you mounting the storage from your vmware host (it was vmware right?)
20:28 semiosis you've got VM hypervisor, then network presumably, then whatever is going on inside XIV
20:29 warci yes, so we have an 8G FC connection to our vmware esx servers
20:29 diegows joined #gluster
20:29 warci from the nas
20:29 warci the rest is just gigabit network
20:29 _dist warci: what protocol is VMWare mounting with?
20:29 _dist (iSCSI?)
20:30 warci fibre channel raw lun
20:30 warci 8GB
20:30 warci x2
20:30 _dist ah
20:31 warci i'm wondering, is there some limitation on the kernel side as the lun is only seen as one device?
20:31 warci because our physical servers all use multipathing
20:31 warci and our xiv has 8 channels, so 8 luns on the linux side for each exported volume
20:31 warci but of course, vmware can only use one
20:32 warci i mean, i know i'm abusing gluster a bit in this configuration, but our hardware is kinda overkill for what i'm trying to do, no?
20:33 Ark joined #gluster
20:33 warci btw: when %util is at 100% for a while, my iowaits grow until +- 24%
20:33 warci but only after an hour or so
20:34 warci but right now i'm running some rsyncs, so it's a high iops operation
20:36 warci just fyi, the xiv has 9 shelves with 12 disks of 4 TB
20:36 warci data distribution is handled internally and can not be affected by the user
20:36 _dist warci: I really don't think you're maxing out 2x8gb channels. I would suspect your disks are actually being maxed out. When you say 9 shelves with 12 disks you mean 9*12 disks?
20:37 warci yes
20:37 warci these are all servers connected internally with infiniband
20:37 warci and a huge cache memory
20:38 warci but the disks are slow sata
20:38 _dist yeah but 20 vms would never max that, unless there's a misconfig. I think semiosis is right and you need to do strace, run an iotop and get the pids from there
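
A sketch of that kind of inspection (the PID is a placeholder):

    # Only processes actually doing I/O, with accumulated totals.
    iotop -oPa

    # Extended per-device stats every 2 seconds, to watch %util and await.
    iostat -xm 2

    # Once a busy PID is known (e.g. a glusterfsd brick process), summarise
    # its syscalls for a while to see where the time goes.
    strace -f -c -p <PID>
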
20:39 warci ok.. i'll see what i can find out
20:39 warci but right now after business hours it's doing nothing
20:39 warci and i'm running 10 rsyncs, and it's enough to flood the server
20:40 warci i'm not a specialist at all, but that seems a bit quick, no?
20:41 _dist my best hunch would be that your vms are running with no local cache and forcing full sync on all writes but there are so many variables in your setup
20:42 warci hmm that's indeed tricky to find out
20:42 _dist but yeah, unless the raid is a stupid setup 108 7200rpm disks should definitely handle 20vms doing _anything_
20:43 _dist you shouldn't be seeing 100% util unless you've got failing disks
20:43 _dist unless those same disks are doing other stuff too, we didn't cover that
20:45 _dist I'm running 32 VMs on a 3-way mirror of 15 disks (WD reds) and my %util rarely goes above 50% on any of them, my iowait peaks around 9% during the worst DB jobs
20:45 warci hmm could it be that gluster just adds a lot of overhead because of metadata operations?
20:46 _dist there's no question that it adds overhead, xattr r/w mostly in my experience. I find I get about 80% native speed and iops with about 3x the cpu.
20:46 warci the whole xiv is doing 6000 iops atm and the lun for gluster does about 1000
20:48 warci so you guys are still thinking more that the bottleneck is the disk system?
20:48 _dist 6000iops is a lot on disks that may only have 50-60 of random r/w iops.
20:48 warci at business hours we get 20K iops
20:49 warci but that's mostly cache hits
20:49 _dist ah, you have some kind of cache acceleration then?
20:50 _dist well, I guess that's not necessary with 108 disks, it just depends on the "type" of workload, very random workloads might bring a disk down to 40iops that would otherwise give you 200iops
20:51 warci it has about 800GB of cache, but of course the stuff i'm doing is ultra random, so i'm guessing it needs a lot of disk access
20:51 warci that might explain it
20:51 warci mmmm so fully screwed :)
20:52 warci what i still suspect, but those ibm guys don't want to confirm it, is that the xiv is capped at 1200 iops per lun
20:53 warci anyway guys, thanks a lot for brainstorming!
20:53 _dist maybe try something that would push it past that, something easy that's sequential like a dd or a multiple "stress" run
20:53 warci i'll try to dig a bit deeper in the direction of the disk system
20:54 _dist good luck!
20:55 warci thanks! i'll report back if i find anything... i wish i had some more knowledge about all this stuff
20:56 _dist well, no better way to learn :)
20:57 warci true that :) this is all very fascinating stuff
20:58 wgao_ joined #gluster
21:09 kombucha joined #gluster
21:13 daMaestro joined #gluster
21:22 Ark Anyone know if the gluster hashing algorithm for placing files into bricks has been updated in 3.4.3 or 3.5.1?
21:22 Ark I have large files that do not split evenly every time a backup happens (900GBs), leading one side of the volume to fill faster than the other side.
21:32 semiosis Ark: how long are the filenames?
21:33 semiosis someone once told me that longer filenames distribute more evenly than short filenames
21:33 Ark ibdata0 7 letters
21:33 Ark yeah the directory has 15+ then the files that are dropped in are short, I can't really change that : /
21:34 semiosis pretty sure the hash only works on the filename, not including directories
21:35 sprachgenerator joined #gluster
21:37 cmtime Are you saying you have 10 large files that should spread over, say, 4 servers and they are not?
21:37 recidive joined #gluster
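
Background on the question above: DHT hashes only the file's name (not its full path) into a 32-bit space, and each directory carries a per-brick slice of that space in an xattr, so a few large files with similar short names can easily land on the same brick. The layout can be read off the bricks, and cluster.min-free-disk is the usual knob for steering new files away from a nearly full brick (paths and volume name below are hypothetical):

    # Each brick's copy of a directory stores its assigned hash range.
    getfattr -n trusted.glusterfs.dht -e hex /bricks/b1/backups
    getfattr -n trusted.glusterfs.dht -e hex /bricks/b2/backups

    # Send new files to other bricks once this one passes the threshold
    # (existing data is not moved).
    gluster volume set myvol cluster.min-free-disk 10%
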
21:53 kombucha joined #gluster
21:59 jbrooks joined #gluster
22:00 jvandewege_ joined #gluster
22:00 Thilam|work joined #gluster
22:00 Lee_ joined #gluster
22:02 johnmark joined #gluster
22:05 kombucha_ joined #gluster
22:09 theron joined #gluster
22:11 PeterA got another error
22:11 PeterA E [posix-helpers.c:893:posix_handle_pair] 0-sas03-posix: /brick03/gfs/DevMordorHomeSata03//hcamara//custom_interim_reports//lumber_liquidators/weekly_report/lumber_liquidator_weekly_custom_pos.py: key:trusted.glusterfs.dht.linkto error:File exists
22:11 PeterA what is key:trusted.glusterfs.dht.linkto error:File exists ??
22:11 PeterA seems like hiting this bug??
22:11 PeterA https://bugzilla.redhat.com/show_bug.cgi?id=1030200
22:11 glusterbot Bug 1030200: medium, unspecified, ---, rhs-bugs, NEW , DHT : file rename operation is successful but log has error 'key:trusted.glusterfs.dht.linkto error:File exists' , 'setting xattrs on <old_filename> failed (File exists)'
22:12 social joined #gluster
22:13 pdrakeweb joined #gluster
22:14 johnmark joined #gluster
22:16 systemonkey joined #gluster
22:16 Rydekull joined #gluster
22:17 JustinClift joined #gluster
22:17 foobar joined #gluster
22:17 ninkotech joined #gluster
22:17 Rydekull joined #gluster
22:20 necrogami joined #gluster
22:27 glusterbot New news from newglusterbugs: [Bug 1037511] Operation not permitted occurred during setattr of <https://bugzilla.redhat.com/show_bug.cgi?id=1037511>
22:28 qdk joined #gluster
22:36 julim joined #gluster
22:39 recidive joined #gluster
23:25 doo joined #gluster
23:35 plarsen joined #gluster
23:37 rwheeler joined #gluster
23:43 sputnik13 joined #gluster
23:55 gildub joined #gluster
23:56 pdrakeweb joined #gluster
