
IRC log for #gluster, 2016-02-17


All times shown according to UTC.

Time Nick Message
00:01 Wizek joined #gluster
00:01 haomaiwang joined #gluster
00:05 delhage joined #gluster
00:12 sjohnsen joined #gluster
00:15 ovaistariq joined #gluster
00:24 djgerm joined #gluster
00:25 ovaistariq joined #gluster
00:29 djgerm Should a client mount a VIP or something that round robins between servers in their fstab? or How do I mount a distributed filesystem?
00:31 djgerm backupvolfile-server= ?
00:31 gildub joined #gluster
00:34 djgerm looks like! http://www.gluster.org/community/documentation/index.php/Setting_Up_Clients
00:34 djgerm thanks!
00:54 JoeJulian djgerm: ,,(mount server)
00:54 glusterbot djgerm: (#1) The server specified is only used to retrieve the client volume definition. Once connected, the client connects to all the servers in the volume. See also @rrdns, or (#2) One caveat is that the clients never learn of any other management peers. If the client cannot communicate with the mount server, that client will not learn of any volume changes.
00:55 EinstCrazy joined #gluster
00:55 yosafbridge joined #gluster
00:55 EinstCrazy joined #gluster
00:56 djgerm Thanks JoeJulian.
00:57 djgerm that makes sense. and in case server1 is down, connect to server2....
00:57 djgerm specified in backupvol-file
00:58 djgerm volfile-server
00:58 djgerm and that'll then connect to all remaining servers that are up
00:58 JoeJulian correct
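A minimal /etc/fstab entry illustrating what is described above (hostnames, volume name and mount point are placeholders, not taken from the log):

    # server1 is only contacted to fetch the volume definition; the client then
    # connects to every brick server in the volume. backupvolfile-server is the
    # fallback used if server1 is unreachable at mount time.
    server1:/myvol  /mnt/myvol  glusterfs  defaults,_netdev,backupvolfile-server=server2  0 0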
01:01 haomaiwa_ joined #gluster
01:12 EinstCrazy joined #gluster
01:13 hackman joined #gluster
01:14 hackman hi guys. I have a two node gluster setup that till yesterday was running over IPv4. I want to move it over IPv6
01:14 hackman I added "option transport.address-family inet6" to glusterd.vol
01:14 hackman and commented the IPv4 hosts in /etc/hosts
01:15 hackman but now the two nodes can't see each other
01:15 hackman :(
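For reference, the glusterd.vol change hackman describes would look roughly like this; a sketch of /etc/glusterfs/glusterd.vol in which only the address-family line is the addition mentioned above, the rest being the stock management volume definition:

    volume management
        type mgmt/glusterd
        option working-directory /var/lib/glusterd
        option transport-type socket,rdma
        option transport.address-family inet6
        # (remaining stock options unchanged)
    end-volume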
01:15 JoeJulian what version?
01:19 hackman glusterfs 3.7.8 built on Feb  9 2016 06:30:46
01:20 wwwbukolaycom joined #gluster
01:21 hackman its running on CentOS 6.5, and it is from the official gluster repo
01:21 JoeJulian Just making sure it wasn't during the "dark period" ;)
01:22 JoeJulian There's been a lot of work done on ipv6 support over the last couple versions and I still see there's work actively being done. My understanding was that it was supposed to be working now.
01:25 hackman I'm running gluster since... 2008 :) but IPv6 was always a problem :)
01:29 JoeJulian Hmm, I thought I remembered one of the devs saying it should work, but the only references I can find are targeting ipv6 support to 3.8.
01:30 hackman JoeJulian, it works... in pure IPv6 setups
01:30 hackman but unfortunately for this client... I can't turn off ipv4 on his machines
01:30 Lee1092 joined #gluster
01:31 JoeJulian I would recommend emailing nithind1988@yahoo.in
01:31 kdhananjay joined #gluster
01:31 hackman [2016-02-17 01:30:47.336350] E [MSGID: 101075] [common-utils.c:306:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
01:31 hackman [2016-02-17 01:30:47.336422] E [name.c:247:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host db2
01:31 hackman [2016-02-17 01:30:47.336580] I [MSGID: 106004] [glusterd-handler.c:5127:__glusterd_peer_rpc_notify] 0-management: Peer <db2> (<94bc240d-6c1d-4a64-8197-357b4925882b>), in state <Peer in Cluster>, has disconnected from glusterd.
01:32 hackman it should be possible to fix this with gai.conf
01:32 hackman :)
01:32 hackman I want to first exhaust that option :)
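For context, /etc/gai.conf controls how glibc's getaddrinfo() orders results when a name resolves to both IPv4 and IPv6 addresses; whether it helps with this particular failure is exactly what hackman wants to test. A sketch of the stock RFC 3484 precedence table (note that specifying any precedence line replaces the whole default table):

    # With these values getaddrinfo() returns IPv6 results ahead of IPv4-mapped
    # ones; some setups raise ::ffff:0:0/96 to 100 to prefer IPv4, which is the
    # opposite of what is wanted here.
    precedence ::1/128       50
    precedence ::/0          40
    precedence 2002::/16     30
    precedence ::/96         20
    precedence ::ffff:0:0/96 10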
01:43 kovshenin joined #gluster
01:45 ira joined #gluster
01:48 ovaistariq joined #gluster
01:52 haomaiwa_ joined #gluster
01:53 baojg joined #gluster
01:58 5EXAAE7LO joined #gluster
01:59 5EXAAE7LS joined #gluster
02:00 haomaiwa_ joined #gluster
02:00 djgerm left #gluster
02:01 haomaiwang joined #gluster
02:02 haomaiwang joined #gluster
02:03 haomaiwang joined #gluster
02:04 haomaiwa_ joined #gluster
02:05 haomaiwang joined #gluster
02:06 haomaiwa_ joined #gluster
02:07 14WAAF0EB joined #gluster
02:08 haomaiwa_ joined #gluster
02:09 raghu joined #gluster
02:09 14WAAF0EZ joined #gluster
02:10 haomaiwang joined #gluster
02:11 7YUAAYBCY joined #gluster
02:12 haomaiwang joined #gluster
02:13 14WAAF0F3 joined #gluster
02:14 haomaiwang joined #gluster
02:15 haomaiwa_ joined #gluster
02:16 20WAAECTY joined #gluster
02:17 haomaiwa_ joined #gluster
02:26 farhoriz_ joined #gluster
02:38 hackman I believe that this is the fix I'm searching for: http://review.gluster.org/#/c/11988/
02:38 glusterbot Title: Gerrit Code Review (at review.gluster.org)
02:38 hackman I'll try it in the morning :)
02:48 ilbot3 joined #gluster
02:48 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
02:54 atrius joined #gluster
02:57 sakshi joined #gluster
03:01 haomaiwa_ joined #gluster
03:08 harish_ joined #gluster
03:10 unlaudable joined #gluster
03:17 Wizek_ joined #gluster
03:18 baojg joined #gluster
03:18 nishanth joined #gluster
03:25 skoduri joined #gluster
03:32 coredump joined #gluster
03:42 anoopcs hackman, You are right. http://review.gluster.org/#/c/11988/ contains the bug fixes for IPv6.
03:42 glusterbot Title: Gerrit Code Review (at review.gluster.org)
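For anyone wanting to try that change locally, Gerrit changes can normally be pulled straight into a source tree; a rough sketch (the patchset number must be taken from the review page, it is not given in the log):

    git clone https://github.com/gluster/glusterfs.git && cd glusterfs
    # Gerrit refs are refs/changes/<last two digits>/<change number>/<patchset>
    git fetch http://review.gluster.org/glusterfs refs/changes/88/11988/<patchset>
    git cherry-pick FETCH_HEAD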
03:43 rcampbel3 joined #gluster
03:43 bharata-rao joined #gluster
03:45 RameshN joined #gluster
03:54 coredump joined #gluster
03:59 JoeJulian anoopcs++
03:59 glusterbot JoeJulian: anoopcs's karma is now 1
03:59 ira joined #gluster
04:00 vimal joined #gluster
04:01 haomaiwang joined #gluster
04:03 bharata-rao left #gluster
04:05 atinm joined #gluster
04:07 itisravi joined #gluster
04:10 rafi joined #gluster
04:10 ramteid joined #gluster
04:17 farhoriz_ joined #gluster
04:19 shubhendu joined #gluster
04:20 kanagaraj joined #gluster
04:27 ppai joined #gluster
04:27 rcampbel3 joined #gluster
04:29 baojg joined #gluster
04:35 ahino joined #gluster
04:35 nbalacha joined #gluster
04:36 ueberall joined #gluster
04:36 ueberall joined #gluster
04:37 coredump joined #gluster
04:37 vimal joined #gluster
04:39 ramky joined #gluster
04:42 shubhendu joined #gluster
04:45 calavera joined #gluster
04:47 gem joined #gluster
04:48 gem_ joined #gluster
04:49 gowtham joined #gluster
04:52 BuffaloCN joined #gluster
04:55 pppp joined #gluster
04:56 ndarshan joined #gluster
04:57 farhoriz_ joined #gluster
05:01 haomaiwa_ joined #gluster
05:03 Wizek__ joined #gluster
05:03 poornimag joined #gluster
05:07 BuffaloCN joined #gluster
05:15 kshlm joined #gluster
05:18 Ramereth joined #gluster
05:22 Wizek joined #gluster
05:24 jiffin joined #gluster
05:25 aravindavk joined #gluster
05:27 nehar joined #gluster
05:29 gowtham joined #gluster
05:29 kotreshhr joined #gluster
05:31 Manikandan joined #gluster
05:31 hgichon joined #gluster
05:31 atrius joined #gluster
05:32 kasturi joined #gluster
05:37 ovaistariq joined #gluster
05:38 JoeJulian hgichon: No, there's no options to speed up fuse performance on large files.
05:38 JoeJulian The defaults should be able to max out your network.
05:38 JoeJulian Why, what are you seeing?
05:38 ovaistar_ joined #gluster
05:39 JoeJulian 1.4GB on a distributed volume is surprisingly slow.
05:40 JoeJulian I assume this is low latency?
05:40 hgichon Yes. Rdma transport
05:41 Apeksha joined #gluster
05:41 JoeJulian Ah, check your IB drivers. I've heard reports of that with certain brands and drivers.
05:41 JoeJulian (Can't remember which ones off the top of my head)
05:43 hgichon Is it helpful to test on the 3.18 kernel ... the kernel where fuse write cache was added?
05:44 Bhaskarakiran joined #gluster
05:46 hgichon 1.4GB is single dd write result. 2GB/s is max result with Multiple dd
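The single-writer test being quoted is presumably something along these lines (file name, block size and count are illustrative only):

    # one sequential writer onto the FUSE mount; conv=fsync forces a flush before
    # dd reports, so the figure is not inflated by the page cache
    dd if=/dev/zero of=/mnt/glustervol/ddtest.img bs=1M count=10240 conv=fsync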
05:46 _Bryan_ joined #gluster
05:49 hgowtham joined #gluster
05:52 karthikfff joined #gluster
05:53 JoeJulian I don't know. I'm on 4.4.1, myself, but no IB toys for me.
06:00 alghost joined #gluster
06:01 hgichon Thanks joe... I will share test results.
06:01 haomaiwa_ joined #gluster
06:01 atalur joined #gluster
06:01 overclk joined #gluster
06:02 anoopcs hgichon, I would also interested to see those test results.
06:02 anoopcs s/also/also be
06:03 anoopcs hgichon, btw, what was your volume configuration?
06:05 hgichon 56g ib and distributed 4 bricks
06:06 shubhendu joined #gluster
06:06 hgichon 7 ssd with raid0
06:06 ashiq joined #gluster
06:13 nbalacha joined #gluster
06:13 anoopcs hgichon, Can you please try changing the transport type of the volume to tcp and rerun your tests? Just to confirm the gain we get with rdma.
06:13 anoopcs hgichon, follow this link: https://gluster.readthedocs.org/en/latest/Administrator%20Guide/RDMA%20Transport/#changing-transport-of-volume
06:13 glusterbot Title: RDMA Transport - Gluster Docs (at gluster.readthedocs.org)
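The procedure in that document boils down to a few CLI steps; a sketch, assuming a volume named testvol:

    # unmount all clients first, then:
    gluster volume stop testvol
    gluster volume set testvol config.transport tcp
    gluster volume start testvol
    # to switch back afterwards: gluster volume set testvol config.transport tcp,rdma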
06:13 hgowtham joined #gluster
06:14 sripathi joined #gluster
06:16 hgowtham joined #gluster
06:16 karnan joined #gluster
06:24 ekuric joined #gluster
06:28 hgichon With tcp 1GB/s in 1dd ... 13GB/s in 2 dd
06:29 nbalacha joined #gluster
06:30 atrius joined #gluster
06:43 anil joined #gluster
06:49 RameshN joined #gluster
06:58 atalur joined #gluster
07:01 farhori__ joined #gluster
07:01 7YUAAYC5A joined #gluster
07:11 anil joined #gluster
07:17 atrius joined #gluster
07:19 haomaiwa_ joined #gluster
07:23 mhulsman joined #gluster
07:26 jtux joined #gluster
07:26 anoopcs hgichon, 13 GB/s?
07:30 julim joined #gluster
07:33 post-factum joined #gluster
07:35 TonyBurn joined #gluster
07:41 [diablo] joined #gluster
07:42 hgichon Oh sorry. 1.3GB
07:49 cuqa joined #gluster
07:49 cuqa joined #gluster
07:51 ira joined #gluster
08:01 haomaiwa_ joined #gluster
08:02 arcolife joined #gluster
08:09 ivan_rossi joined #gluster
08:13 ilbot3 joined #gluster
08:13 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
08:17 [Enrico] joined #gluster
08:25 atrius joined #gluster
08:28 abyss_ joined #gluster
08:28 jri joined #gluster
08:29 cpetersen joined #gluster
08:29 nishanth joined #gluster
08:30 Akee joined #gluster
08:31 cpetersen joined #gluster
08:32 sripathi1 joined #gluster
08:39 rafi1 joined #gluster
08:39 atrius joined #gluster
08:42 fsimonce joined #gluster
08:46 gem joined #gluster
08:47 Akee joined #gluster
08:50 ahino joined #gluster
08:50 ctria joined #gluster
08:51 Simmo joined #gluster
08:52 anoopcs hgichon, Ok. So we have better performance with RDMA than with TCP.
08:55 nbalacha joined #gluster
08:58 anti[Enrico] joined #gluster
09:01 haomaiwa_ joined #gluster
09:04 harish_ joined #gluster
09:08 kdhananjay joined #gluster
09:10 atrius joined #gluster
09:13 anoopcs hgichon, Check your IB device latency with qperf and see whether it matches with the results that you have.
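A quick way to do that check is to run qperf as a listener on one server and point another at it; a sketch with placeholder hostnames:

    server1$ qperf                          # acts as the test server
    server2$ qperf server1 rc_lat rc_bw     # RDMA RC latency/bandwidth over IB
    server2$ qperf server1 tcp_lat tcp_bw   # same pair over TCP for comparison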
09:15 R0ok_ joined #gluster
09:15 hgowtham joined #gluster
09:17 ashiq joined #gluster
09:21 atrius joined #gluster
09:29 drankis joined #gluster
09:29 manous joined #gluster
09:29 haomaiw__ joined #gluster
09:31 jiffin1 joined #gluster
09:31 rafi joined #gluster
09:32 Saravanakmr joined #gluster
09:33 haomaiwa_ joined #gluster
09:34 atrius joined #gluster
09:34 yoavz joined #gluster
09:37 bio_ on 2 concurrent volume rebalance runs, why do i get the same output in the status output. the same number of files being shown as having been rebalanced, as if i did not rebalance them in the first run
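For reference, the status output in question comes from the rebalance CLI (volume name is a placeholder):

    gluster volume rebalance myvol start
    gluster volume rebalance myvol status   # per node: rebalanced files, size, scanned, failures, status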
09:41 Slashman joined #gluster
09:41 deniszh joined #gluster
09:47 rafi1 joined #gluster
09:48 nbalacha joined #gluster
09:49 hgowtham joined #gluster
09:51 tdasilva joined #gluster
09:52 robb_nl joined #gluster
09:52 fale joined #gluster
09:52 mmckeen joined #gluster
09:52 Chinorro joined #gluster
09:52 and` joined #gluster
09:52 d4n13L joined #gluster
09:52 s-hell joined #gluster
09:52 lupine joined #gluster
09:52 tru_tru joined #gluster
09:52 unforgiven512 joined #gluster
09:52 rjoseph joined #gluster
09:52 cvstealth joined #gluster
09:52 rastar joined #gluster
09:52 Champi joined #gluster
09:52 partner joined #gluster
09:53 xMopxShell joined #gluster
09:53 pocketprotector joined #gluster
09:53 dmnchild joined #gluster
09:53 Trefex joined #gluster
09:53 mlhess joined #gluster
09:54 morse joined #gluster
09:54 kalzz joined #gluster
09:56 Ulrar joined #gluster
09:56 _fortis joined #gluster
09:56 Arrfab joined #gluster
09:57 owlbot joined #gluster
09:57 kblin joined #gluster
09:58 swebb joined #gluster
09:58 sadbox joined #gluster
09:59 sghatty_ joined #gluster
10:00 Chr1st1an joined #gluster
10:00 lezo joined #gluster
10:01 haomaiwang joined #gluster
10:05 ilbot3 joined #gluster
10:05 Topic for #gluster is now Gluster Community - http://gluster.org | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
10:05 Chinorro joined #gluster
10:05 and` joined #gluster
10:06 Arrfab joined #gluster
10:06 sc0 joined #gluster
10:06 Pintomatic joined #gluster
10:07 tdasilva joined #gluster
10:08 dataio joined #gluster
10:08 [Enrico] joined #gluster
10:08 overclk joined #gluster
10:08 tru_tru joined #gluster
10:08 rossdm joined #gluster
10:08 ccha2 joined #gluster
10:08 tom[] joined #gluster
10:08 p8952 joined #gluster
10:08 wushudoin| joined #gluster
10:08 mbukatov joined #gluster
10:08 ron-slc joined #gluster
10:08 ackjewt joined #gluster
10:08 R0ok_ joined #gluster
10:08 jri joined #gluster
10:09 stopbyte joined #gluster
10:10 Slashman joined #gluster
10:11 hackman joined #gluster
10:12 gem joined #gluster
10:12 sloop joined #gluster
10:12 ws2k3_ joined #gluster
10:13 twisted` joined #gluster
10:14 fyxim joined #gluster
10:20 petan joined #gluster
10:27 mhulsman joined #gluster
10:29 atrius joined #gluster
10:30 ovaistariq joined #gluster
10:30 suliba joined #gluster
10:39 atrius joined #gluster
10:47 ashiq joined #gluster
10:48 matclayton joined #gluster
10:49 abyss_ joined #gluster
10:49 Larsen_ joined #gluster
10:49 lapy joined #gluster
10:49 lapy joined #gluster
10:49 eljrax joined #gluster
10:49 dmnchild joined #gluster
10:49 burn joined #gluster
10:49 ccha2 joined #gluster
10:49 Bardack joined #gluster
10:49 karnan joined #gluster
10:49 mrEriksson joined #gluster
10:49 Saravanakmr joined #gluster
10:49 sakshi joined #gluster
10:49 ueberall joined #gluster
10:49 ueberall joined #gluster
10:49 ivan_rossi joined #gluster
10:49 crashmag joined #gluster
10:50 cuqa_ joined #gluster
10:50 XpineX joined #gluster
10:50 shaunm joined #gluster
10:50 Champi joined #gluster
10:50 tg2 joined #gluster
10:50 xMopxShell joined #gluster
10:50 NuxRo joined #gluster
10:50 matclayton Hey guys, we’re currently bumping from 3.6.x to 3.7.8 and rolling out a new filesystem using RF3 with an arbiter. The 3.6 gluster clients are only showing the volume size in df as a multiple of the arbiters; is this likely to be a problem in practice, or is it just a df issue which we could live with?
10:50 ndk joined #gluster
10:50 yawkat joined #gluster
10:51 liewegas joined #gluster
10:51 ndevos itisravi: maybe you can help matclayton with his arbiter question?
10:51 jockek joined #gluster
10:52 JoeJulian joined #gluster
10:52 ashka joined #gluster
10:52 sadbox joined #gluster
10:52 Nebraskka joined #gluster
10:52 klaxa joined #gluster
10:52 dron23 joined #gluster
10:52 Ramereth joined #gluster
10:53 xMopxShell joined #gluster
10:53 atrius joined #gluster
10:53 moss joined #gluster
10:53 shubhendu joined #gluster
10:53 Reiner031 joined #gluster
10:53 sagarhani joined #gluster
10:53 dries joined #gluster
10:53 kalzz joined #gluster
10:54 msvbhat joined #gluster
10:54 bio_ joined #gluster
10:55 virusuy joined #gluster
10:55 k-ma joined #gluster
10:55 v12aml joined #gluster
10:56 Chr1st1an joined #gluster
10:58 scubacuda joined #gluster
10:58 Pintomatic joined #gluster
10:59 lezo joined #gluster
11:00 ws2k3_ joined #gluster
11:00 twisted` joined #gluster
11:01 haomaiwa_ joined #gluster
11:01 sghatty_ joined #gluster
11:01 luizcpg joined #gluster
11:02 sjohnsen joined #gluster
11:03 Leildin joined #gluster
11:04 itisravi matclayton: Is the arbiter brick itself storing data in the files, or are the files still zero-byte files?
11:04 jockek joined #gluster
11:04 atrius joined #gluster
11:04 ndevos joined #gluster
11:04 ndevos joined #gluster
11:04 dmnchild joined #gluster
11:04 janegil joined #gluster
11:04 kblin joined #gluster
11:04 kblin joined #gluster
11:04 rossdm joined #gluster
11:04 rossdm joined #gluster
11:04 nbalacha joined #gluster
11:04 oliva joined #gluster
11:04 csaba joined #gluster
11:04 atinm joined #gluster
11:04 kanagaraj joined #gluster
11:04 harish_ joined #gluster
11:04 devilspgd joined #gluster
11:04 bfoster joined #gluster
11:04 mmckeen joined #gluster
11:04 Manikandan joined #gluster
11:04 ahino joined #gluster
11:04 tru_tru joined #gluster
11:04 Bhaskarakiran joined #gluster
11:05 monotek joined #gluster
11:05 yawkat joined #gluster
11:05 crashmag joined #gluster
11:05 Peppard joined #gluster
11:05 wolsen joined #gluster
11:05 lanning joined #gluster
11:05 mhulsman joined #gluster
11:05 unforgiven512 joined #gluster
11:05 matclayton joined #gluster
11:06 owlbot joined #gluster
11:06 burn joined #gluster
11:06 al joined #gluster
11:06 matclayton itisravi: not sure what you mean, we’ve not written any files to the system yet
11:07 anoopcs joined #gluster
11:07 itisravi matclayton: ah then it is likely a bug that was fixed with this patch http://review.gluster.org/11857
11:07 glusterbot Title: Gerrit Code Review (at review.gluster.org)
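For reference, an arbiter volume of the kind matclayton is rolling out is created with the "replica 3 arbiter 1" syntax; a sketch with placeholder hostnames and brick paths:

    gluster volume create myvol replica 3 arbiter 1 \
        server1:/bricks/brick1 server2:/bricks/brick1 server3:/bricks/arbiter1
    gluster volume start myvol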
11:07 kbyrne joined #gluster
11:08 gildub joined #gluster
11:08 Leildin hey guys, I'm having to change my network around and have a gluster to migrate and change IP. is there an easy way to do it or is it all modifying config files and bricks files ?
11:08 matclayton Erm, we’re not really seeing a but, but rather I want to know if a 3.6 client will work with a 3.7 server running arbitors
11:08 matclayton *see a bug
11:08 matclayton right now, it appears to connect, but the 3.6 clients are showing incorrect volume sizes in df
11:08 fyxim joined #gluster
11:09 itisravi matclayton: I'd recommend upgrading the clients to 3.7 too
11:10 matclayton itisravi: we have two systems (old and new) and doing a live migration between them, the old ones run ubuntu 12.04 and glusterfs 3.6.3 and new ones run 14.04 and glusterfs 3.7.8
11:10 sagarhani joined #gluster
11:10 hackman is someone working on: http://review.gluster.org/#/c/11988/
11:10 glusterbot Title: Gerrit Code Review (at review.gluster.org)
11:10 amye joined #gluster
11:10 matclayton so trying to find a client which could mount both and work during the transition
11:11 _nixpanic joined #gluster
11:11 matclayton we’d bump to 3.7 clients once complete and shutdown all the 3.6 system. The alternative solution is to just run NFS clients during the migration period which is looking like the best solution right now...
11:11 _nixpanic joined #gluster
11:12 matclayton itisravi: would you advise using nfs instead of mixing up 3.6 / 3.7 for a period of time?
11:13 itisravi matclayton: yes I would prefer that
11:14 _fortis joined #gluster
11:14 itisravi matclayton: so If I understand what you are doing, you're going to mount the 3.7 volume using NFS and then copy everything from the 3.6 volume into the 3.7 one right?
11:14 billputer joined #gluster
11:15 Wizek joined #gluster
11:15 owlbot joined #gluster
11:15 matclayton we’d likely do it the other way, and mount the 3.6 system with NFS, and 3.7 with gluster client, then copy it all over
11:15 Chr1st1an joined #gluster
11:15 frankS2 joined #gluster
11:15 sc0 joined #gluster
11:15 matclayton itisravi: but we could do either, if there is advantage doing it one way
11:16 matclayton itisravi: the deployment is a website, so we control all the clients and servers thankfully
11:17 matclayton the datamigration is likely to take 3-4 months though, so I just need something which is stable throughout
11:18 itisravi matclayton: The gluster stack (and specifically the replication translator) is loaded both on FUSE mounts and NFS mounts. Since your clients are still 3.6.3, if you use a fuse mount, the client stack is still 3.6.3
11:18 itisravi matclayton: But if you use gluster NFS mount, since the gluster NFS process is on the server, it would use 3.7.8.
11:18 lh_ joined #gluster
11:18 itisravi which is why NFS might be a better option.
11:19 matclayton makes sense, so we want the clients on either NFS or 3.7.8
11:19 owlbot joined #gluster
11:19 itisravi matclayton: right
11:20 matclayton itisravi: I think we archive this either way, we’d do server (3.6) with NFS + client/server with 3.7.8, or server (3.7.8) with NFS + client/server 3.6.3
11:20 matclayton *achieve
11:21 owlbot joined #gluster
11:22 itisravi matclayton: sorry I am a bit confused. Are you upgrading your existing volume to 3.7.8 or creating a new volume based off 3.7.8 and copying data into that from your older 3.6.3 based volume?
11:23 matclayton itisravi: sorry my fault, we have a 3.6 system in place already, and have a brand new 3.7.8 cluster
11:23 matclayton all the data right now is in 3.6 on old hardware we are planning to shutdown
11:24 overclk joined #gluster
11:24 matclayton so we are migrating from 3.6 to 3.7.8 and they are running on new machines
11:25 itisravi matclayton: right, so you just mount the 3.7.8 volume on any of your clients using NFS mount. Then copy the data from the 3.6 volume (which I presume is also mounted on the same client).
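On a client that mounts both volumes, that plan looks roughly like this (hostnames, volume names and mount points are placeholders; gluster's built-in NFS server speaks NFSv3 over TCP):

    # old 3.6 volume via gluster's built-in NFS server
    mount -t nfs -o vers=3,proto=tcp oldserver:/oldvol /mnt/oldvol
    # new 3.7.8 volume via the native FUSE client
    mount -t glusterfs newserver:/newvol /mnt/newvol
    # then copy the data across, e.g.
    cp -a /mnt/oldvol/. /mnt/newvol/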
11:25 matclayton itisravi: thats the plan
11:25 matclayton itisravi: the issue we had was neither gluster 3.6 or 3.7 client seemed to work on both backends
11:25 bio_ another try: on 2 concurrent volume rebalance runs, why do i get the same output in the status output. the same number of files being shown as having been rebalanced, as if it did not rebalance them in the first run. any ideas?
11:26 matclayton itisravi: but switching to NFS seems like a sane solution
11:26 itisravi matclayton: yeah
11:27 matclayton itisravi: we were hesitant for a while as these machines are pulling 5-10Gbit/sec through them, and switching to NFS would add a new hop
11:28 itisravi matclayton: yeah that is true.
11:29 matclayton itisravi: we have 20Gbit per machine so we should be fine, but I was hoping to avoid the increased load if possible
11:30 itisravi matclayton: just wondering, what if you FUSE mount the 3.7.8 volume on one of the servers itself and then scp stuff from the older client into this mount?
11:31 matclayton we could do that, and have done before, but its painful
11:31 itisravi mhm ok
11:31 gem joined #gluster
11:31 matclayton itisravi: we’re pushing 500TB+ of data over the link, and SCP fails a lot :(
11:32 itisravi matclayton: ah is rsync any better?
11:32 matclayton itisravi: it wasn’t, the issue with rsync is the number of files causes it to have headaches
11:32 matclayton it isn’t very smart about the diff
11:32 itisravi okay
11:33 shubhendu joined #gluster
11:33 matclayton the other problem I’ve not really mentioned is that to stay online during this, we need some clients to mount both to serve the data out
11:33 matclayton the current plan is to have the new machines also mount the old volume (with nfs) and using nginx to try one then the other
11:34 itisravi right
11:35 itisravi matclayton: btw, how are you liking the behaviour of arbiter volumes on your 3.6.3 setup?
11:35 kotreshhr left #gluster
11:35 matclayton as in the 3.6.3 clients?
11:35 ccha2 joined #gluster
11:36 itisravi matclayton: oh wait, I guess this is your first go at using arbiter volumes?
11:36 matclayton itisravi: yeah we didn’t have them on 3.6, correct?
11:37 itisravi matclayton: It was there but a lot of fixes have gone in since then.
11:37 matclayton ah wasn’t aware
11:37 matclayton yeah this is our first attempt, love the concept
11:37 matclayton also very keen to get snapshoting live as well
11:37 matclayton and in the long run tiers
11:38 matclayton the one feature I would absolutely love to see is geo replication to S3
11:38 karnan joined #gluster
11:39 matclayton specifically to glacier and other S3 bucket types to keep a backup around which runs an independent stack and can be isolated off with API permissions
11:39 itisravi matclayton: If you find bugs or need help on arbiter volumes, just send a mail on gluster-users. myself, atalur and kdhananjay can help you out if needed.
11:39 matclayton itisravi: cool will do thanks
11:40 Nebraskka joined #gluster
11:40 ccha2 joined #gluster
11:40 Logos01 joined #gluster
11:40 ashka joined #gluster
11:40 atalur joined #gluster
11:40 ChrisHolcombe joined #gluster
11:40 kbyrne joined #gluster
11:40 burn joined #gluster
11:40 hackman joined #gluster
11:40 gowtham joined #gluster
11:40 Norky joined #gluster
11:40 m0zes joined #gluster
11:41 bfoster joined #gluster
11:41 atrius joined #gluster
11:42 HamburgerMartyr_ joined #gluster
11:42 kdhananjay joined #gluster
11:42 klaxa joined #gluster
11:42 Ethical2ak joined #gluster
11:43 samikshan joined #gluster
11:43 owlbot joined #gluster
11:43 mzink_gone joined #gluster
11:44 atinm joined #gluster
11:45 lanning joined #gluster
11:46 Bhaskarakiran joined #gluster
11:47 itisravi matclayton: I was wrong about arbiter being present in 3.6..It was added in 3.7
11:48 itisravi you were correct.
11:48 matclayton itisravi: ah, thanks, thought it might have been hidden :)
11:48 itisravi :)
11:48 matclayton btw what's the expected perf hit going from RF2 to RF3 (with arbiter)?
11:48 matclayton which would expect us to see a shift?
11:49 itisravi arbiter volumes take a full file lock as opposed to range locks in normal RF2..so for multiple writers, you would see a little drop
11:50 matclayton on a single file?
11:50 matclayton or the volume
11:50 [Enrico] joined #gluster
11:50 itisravi on a single file
11:50 samppah joined #gluster
11:50 uebera|| joined #gluster
11:50 paratai joined #gluster
11:50 n-st joined #gluster
11:50 mpingu joined #gluster
11:50 mowntan joined #gluster
11:50 PaulePanter joined #gluster
11:50 sac joined #gluster
11:50 tom[] joined #gluster
11:50 sac joined #gluster
11:50 p8952 joined #gluster
11:50 Bhaskarakiran joined #gluster
11:50 misc joined #gluster
11:50 natarej joined #gluster
11:50 The_Ball joined #gluster
11:50 EinstCrazy joined #gluster
11:50 cliluw joined #gluster
11:50 matclayton thats fine then, we run in worm anyway
11:50 jtux joined #gluster
11:50 timotheus1_ joined #gluster
11:50 pdrakeweb joined #gluster
11:51 julim joined #gluster
11:51 hgowtham joined #gluster
11:51 dastar joined #gluster
11:51 kanagaraj joined #gluster
11:51 Champi joined #gluster
11:51 Vaelatern joined #gluster
11:51 itisravi cool
11:51 CP|AFK joined #gluster
11:51 om joined #gluster
11:52 mattmcc joined #gluster
11:52 jiffin joined #gluster
11:52 arcolife joined #gluster
11:53 volga629 joined #gluster
11:53 mdavidson joined #gluster
11:53 honzik666 joined #gluster
11:53 glusterbot joined #gluster
11:53 kshlm The weekly Gluster community meeting will start in ~5 minutes at #gluster-meeting. Today's agenda is at https://public.pad.fsfe.org/p/gluster-community-meetings
11:53 glusterbot Title: FSFE Etherpad: public instance (at public.pad.fsfe.org)
11:54 muneerse joined #gluster
11:55 [Enrico] joined #gluster
11:59 purpleidea joined #gluster
11:59 purpleidea joined #gluster
11:59 jiffin joined #gluster
11:59 cyberbootje joined #gluster
11:59 cpetersen joined #gluster
11:59 jiffin joined #gluster
11:59 necrogami joined #gluster
11:59 aravindavk joined #gluster
11:59 aravindavk joined #gluster
11:59 bfoster joined #gluster
11:59 wushudoin| joined #gluster
11:59 foster joined #gluster
11:59 codex joined #gluster
11:59 jbrooks joined #gluster
11:59 baoboa joined #gluster
11:59 rastar joined #gluster
11:59 ironhalik joined #gluster
11:59 lalatenduM joined #gluster
11:59 drankis joined #gluster
12:00 xavih joined #gluster
12:00 gem joined #gluster
12:00 kshlm joined #gluster
12:00 sloop joined #gluster
12:00 malevolent joined #gluster
12:00 edualbus joined #gluster
12:00 haomaiwa_ joined #gluster
12:00 ctria joined #gluster
12:00 necrogami joined #gluster
12:00 pppp joined #gluster
12:00 dthrvr joined #gluster
12:01 kmmndr joined #gluster
12:01 wolsen joined #gluster
12:01 fgd joined #gluster
12:02 Vaizki joined #gluster
12:02 hgichon joined #gluster
12:02 jeek joined #gluster
12:02 [o__o] joined #gluster
12:03 shortdudey123 joined #gluster
12:04 XpineX joined #gluster
12:04 tswartz joined #gluster
12:04 nehar joined #gluster
12:04 cvstealth joined #gluster
12:04 Intensity joined #gluster
12:04 7YUAAAAHV joined #gluster
12:04 ccoffey joined #gluster
12:04 nehar joined #gluster
12:04 _fortis joined #gluster
12:04 samsaffron___ joined #gluster
12:04 samsaffron___ joined #gluster
12:04 cristian joined #gluster
12:04 eryc joined #gluster
12:04 eryc joined #gluster
12:04 Intensity joined #gluster
12:04 suliba joined #gluster
12:04 social joined #gluster
12:04 yalu joined #gluster
12:04 kevc joined #gluster
12:05 dries joined #gluster
12:07 The_Ball joined #gluster
12:07 semiosis joined #gluster
12:09 primusinterpares joined #gluster
12:10 Telsin joined #gluster
12:16 Saravanakmr joined #gluster
12:19 ovaistariq joined #gluster
12:21 gowtham joined #gluster
12:25 saltsa joined #gluster
12:25 zoldar joined #gluster
12:25 DJClean joined #gluster
12:25 DJClean joined #gluster
12:25 R0ok_ joined #gluster
12:25 burn joined #gluster
12:25 Peppard joined #gluster
12:25 TonyBurn joined #gluster
12:25 sankarshan joined #gluster
12:25 karthikfff joined #gluster
12:25 karnan joined #gluster
12:25 xMopxShell joined #gluster
12:25 skoduri joined #gluster
12:25 siel joined #gluster
12:25 DV joined #gluster
12:25 vimal joined #gluster
12:25 kkeithley joined #gluster
12:25 luizcpg joined #gluster
12:25 anil joined #gluster
12:25 muneerse joined #gluster
12:25 mlhess joined #gluster
12:25 poornimag joined #gluster
12:25 stopbyte joined #gluster
12:25 karthikfff joined #gluster
12:25 karnan joined #gluster
12:25 sadbox joined #gluster
12:25 vimal joined #gluster
12:25 anil joined #gluster
12:25 kkeithley joined #gluster
12:25 coredump joined #gluster
12:25 mbukatov joined #gluster
12:25 poornimag joined #gluster
12:25 mbukatov joined #gluster
12:26 portante joined #gluster
12:30 RameshN joined #gluster
12:30 JonathanD joined #gluster
12:31 farhoriz_ joined #gluster
12:32 cogsu joined #gluster
12:32 martinet1 joined #gluster
12:32 JPau1 joined #gluster
12:32 rideh joined #gluster
12:32 fale_ joined #gluster
12:32 d4n13L_ joined #gluster
12:32 dmnchild1 joined #gluster
12:32 Nakiri__ joined #gluster
12:32 jotun_ joined #gluster
12:32 oliva joined #gluster
12:32 troj joined #gluster
12:32 wiza joined #gluster
12:32 Wizek joined #gluster
12:32 tg2 joined #gluster
12:32 post-factum joined #gluster
12:32 dthrvr joined #gluster
12:32 lupine joined #gluster
12:32 ashiq joined #gluster
12:32 ashiq joined #gluster
12:32 Humble joined #gluster
12:32 jwang__ joined #gluster
12:32 Bardack joined #gluster
12:32 argonius joined #gluster
12:32 harish_ joined #gluster
12:32 ghenry joined #gluster
12:32 ghenry joined #gluster
12:32 eljrax joined #gluster
12:32 Slashman joined #gluster
12:33 Trefex joined #gluster
12:33 shruti joined #gluster
12:33 Akee joined #gluster
12:33 kblin joined #gluster
12:33 rjoseph joined #gluster
12:33 liewegas joined #gluster
12:33 Dave joined #gluster
12:33 edong23 joined #gluster
12:33 s-hell joined #gluster
12:33 bitpushr joined #gluster
12:33 kblin joined #gluster
12:34 rossdm joined #gluster
12:34 jvandewege joined #gluster
12:34 atrius` joined #gluster
12:34 anoopcs joined #gluster
12:34 lord4163 joined #gluster
12:35 jri joined #gluster
12:35 yosafbridge joined #gluster
12:36 nhayashi joined #gluster
12:37 pocketprotector joined #gluster
12:37 bhuddah joined #gluster
12:38 jermudgeon joined #gluster
12:39 nehar joined #gluster
12:40 _nixpanic joined #gluster
12:40 sagarhani joined #gluster
12:40 atrius joined #gluster
12:40 _nixpanic joined #gluster
12:40 baojg joined #gluster
12:41 valkyr1e joined #gluster
12:43 atalur joined #gluster
12:43 frankS2 joined #gluster
12:46 ndk joined #gluster
12:46 Chr1st1an joined #gluster
12:46 RameshN joined #gluster
12:48 csaba joined #gluster
12:51 sc0 joined #gluster
12:51 Pintomatic joined #gluster
12:52 kanagaraj joined #gluster
12:52 johnmilton joined #gluster
12:54 Lee1092 joined #gluster
12:55 atrius joined #gluster
12:56 luizcpg joined #gluster
12:56 robb_nl joined #gluster
12:58 nishanth joined #gluster
13:01 kanagaraj joined #gluster
13:05 Manikandan joined #gluster
13:10 kdhananjay1 joined #gluster
13:13 shubhendu joined #gluster
13:13 kdhananjay joined #gluster
13:16 karnan joined #gluster
13:18 poornimag joined #gluster
13:19 ira joined #gluster
13:23 ahino joined #gluster
13:23 atrius joined #gluster
13:33 atrius joined #gluster
13:38 [Enrico] joined #gluster
13:47 jiffin1 joined #gluster
13:48 atrius joined #gluster
13:50 javi404 joined #gluster
13:54 jiffin joined #gluster
13:58 raghu joined #gluster
13:59 masterzen joined #gluster
13:59 Dasiel joined #gluster
13:59 Slashman joined #gluster
14:00 atrius joined #gluster
14:03 fyxim joined #gluster
14:04 vimal left #gluster
14:06 samsaffron___ joined #gluster
14:06 ovaistariq joined #gluster
14:10 billputer joined #gluster
14:14 nbalacha joined #gluster
14:19 haomaiwa_ joined #gluster
14:19 atrius joined #gluster
14:20 ueberall_g joined #gluster
14:21 lh_ joined #gluster
14:25 jwd joined #gluster
14:26 fgd I'm trying to replace a brick from a failed node, but the replace-brick fails: http://pastebin.com/YJhfXLp7 Any ideas?
14:26 glusterbot Please use http://fpaste.org or http://paste.ubuntu.com/ . pb has too many ads. Say @paste in channel for info about paste utils.
14:26 karnan joined #gluster
14:26 kdhananjay1 joined #gluster
14:26 shubhendu joined #gluster
14:29 atrius joined #gluster
14:32 Nakiri__ Hello
14:32 glusterbot Nakiri__: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
14:32 Nakiri__ I have glusterfs installed on two servers with replication going on between them. Everything works fine until I shut down one of the peers.
14:32 Nakiri__ Then the other machine goes basically non-responsive for about a minute.
14:32 Nakiri__ Waiting for some kind of timeout? Even ls waits for it.
14:32 Nakiri__ Yet I've done "gluster volume set fs network.ping-timeout "5"" which changed nothing.
14:33 lupine joined #gluster
14:33 Nakiri__ I'd need that time to be under 10 seconds. Would adding an arbiter node speed it up?
14:34 honzik666 A related question, is it possible to add an arbiter node to an already existing volume? The documentation shows arbiter setup along with a new volume creation only
14:36 kdhananjay joined #gluster
14:36 d4n13L Nakiri__ : likely because it's only one server left, so it can't really get a quorum. you should always have an odd number of peers
14:36 d4n13L never ran into that situation myself with only one server left, so not sure what the behaviour actually is
14:36 Nakiri__ So adding an arbiter should work?
14:37 d4n13L so having an arbiter node on a third host might help
14:37 Nakiri__ Alright, thanks a lot!
14:41 honzik666 I have tested running with only 1 node left and it would continue to work normally, however, occasionally, if the second node didn't die completely I would face split-brain. So now I am planning to add arbiter as well. Is it possible to do this to an existing volume?
14:42 frankS2 joined #gluster
14:44 Saravanakmr joined #gluster
14:44 d4n13L dont see why not, just add the brick
14:44 hamiller joined #gluster
14:45 kdhananjay left #gluster
14:46 honzik666 d4n13L: ok, what I mean is an arbiter that won't hold the actual data, only meta data and directory entries: https://gluster.readthedocs.org/en/release-3.7.0/Features/afr-arbiter-volumes/
14:46 glusterbot Title: AFR Arbiter Volumes - Gluster Docs (at gluster.readthedocs.org)
14:47 honzik666 can this be added ex-post?
14:48 nbalacha joined #gluster
14:48 aravindavk joined #gluster
14:50 Nakiri__ Okay, interestingly enough - I fixed my problem.
14:50 Nakiri__ Without a third peer.
14:50 plarsen joined #gluster
14:50 Nakiri__ I just mounted the filesystem as nfs not glusterfs.
14:51 Nakiri__ And it just.. doesn't wait for any timeouts now.
14:51 d4n13L but nfs doesnt switch over in case of a failure
14:51 d4n13L its always connected to one node
14:51 d4n13L except you're having some kind of load-balancing IP
14:51 d4n13L honzik666 : not sure, only thing I found about it: https://www.gluster.org/pipermail/gluster-users/2015-August/023030.html
14:52 glusterbot Title: [Gluster-users] Add-brick and arbiter volume (at www.gluster.org)
14:53 honzik666 d4n13L: yup, this is pretty much what I have found out
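For what it's worth, later GlusterFS releases document converting an existing replica 2 volume by adding an arbiter brick via add-brick; whether 3.7.8 already supports that is exactly the open question in the thread linked above, so the following only shows the general shape of the command:

    gluster volume add-brick myvol replica 3 arbiter 1 server3:/bricks/arbiter1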
14:53 anoopcs amye++
14:53 glusterbot anoopcs: amye's karma is now 3
14:54 amye Yay roadmap ++
14:55 julim joined #gluster
14:55 liewegas joined #gluster
14:55 Nakiri__ The machines are load balanced themselves.
14:55 Nakiri__ The clients are both servers aswell.
14:56 B21956 joined #gluster
14:57 d4n13L so you pretty much mount localhost on each?
14:58 bowhunter joined #gluster
14:58 Nakiri__ Yes.
14:59 Nakiri__ Ultimately what I needed was to have a file system synced on two machines without anything too complicated.
14:59 d4n13L well, only two machines just use DRBD
14:59 lupine weeelllllll
14:59 shubhendu joined #gluster
15:01 Nakiri__ This is for now though, a huge system might grow out of this.
15:01 haomaiwa_ joined #gluster
15:02 chirino_m joined #gluster
15:02 rafi1 joined #gluster
15:02 fgd Hi #gluster! I've asked this a few minutes a go, having another go. I'm trying to replace a brick from a failed node, but the replace-brick fails: https://dpaste.de/3cbe Any ideas?
15:02 glusterbot Title: dpaste.de: Snippet #352672 (at dpaste.de)
15:05 ctria joined #gluster
15:06 coredump joined #gluster
15:06 robb_nl joined #gluster
15:11 nbalacha joined #gluster
15:17 atinm joined #gluster
15:17 karnan joined #gluster
15:20 hagarth joined #gluster
15:20 pppp joined #gluster
15:25 JoeJulian fgd: check peer status
15:26 * JoeJulian shudders at d4n13L's suggestion of drbd.
15:26 d4n13L :D
15:27 d4n13L worked fine for us for years :)
15:27 d4n13L on gluster now though ;)
15:27 JoeJulian It trashed my data three times in as many months.
15:29 rafi joined #gluster
15:31 NuxRo joined #gluster
15:32 papamoose joined #gluster
15:33 kshlm joined #gluster
15:41 mowntan joined #gluster
15:42 skylar joined #gluster
15:44 raghu joined #gluster
15:51 gem joined #gluster
15:54 bennyturns joined #gluster
15:55 ovaistariq joined #gluster
16:01 haomaiwa_ joined #gluster
16:04 CyrilPeponnet joined #gluster
16:10 cpetersen JoeJulian: I am about to do a hard shutdown test of one of my nodes in my cluster.  Wish me luck and no split-brain.
16:11 cpetersen May the HA be with me.
16:14 RameshN joined #gluster
16:21 fyxim joined #gluster
16:22 luizcpg joined #gluster
16:28 edong23 joined #gluster
16:34 sagarhani joined #gluster
16:38 Manikandan joined #gluster
16:41 * cpetersen is going to cry.
16:41 cpetersen Split-brain again!
16:42 glusterbot` joined #gluster
16:45 JoeJulian Tell me about your test process?
16:45 JoeJulian ... and are you using quorum?
16:46 cpetersen lol
16:46 cpetersen quorum-type: auto
16:46 cpetersen cluster.quorum-type: auto
16:46 cpetersen cluster.server-quorum-type: server
16:46 cpetersen I haven't brought the node back up yet.
16:47 lupine testing things before rolling them out. man, I wish I got to do that
16:47 JoeJulian Wait... a server is *down* and you have split-brain?
16:47 cpetersen Statistics shows only a heal-failed.  Info shows a split-brain.
16:47 JoeJulian That's got to be a false positive.
16:47 JoeJulian You can't have split-brain with only one brain.
16:48 cpetersen =)
16:48 JoeJulian replica 2, right?
16:48 cpetersen I have achieved the impossible!!!
16:48 JoeJulian You should have saved that for Friday.
16:49 cpetersen Nodes A, B and C
16:49 cpetersen I killed the power on B.
16:49 JoeJulian Let's call them according to ,,(glossary) terms just to avoid confusion.
16:49 glusterbot A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
16:49 JoeJulian servers A, B and C I'm assuming.
16:49 cpetersen Correct.
16:50 JoeJulian (since a "node" could even be a printer)
16:51 JoeJulian replica 3
16:52 cpetersen glustershd.log: http://ur1.ca/oj6i3
16:52 glusterbot Title: #324163 Fedora Project Pastebin (at ur1.ca)
16:52 cpetersen From server A.
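The two views being compared here come from the heal CLI; a sketch, assuming a volume named vmstore:

    gluster volume heal vmstore info               # entries pending heal or flagged split-brain
    gluster volume heal vmstore info split-brain   # only entries gluster considers split-brain
    gluster volume heal vmstore statistics         # per-crawl counters (healed, heal-failed, ...)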
16:54 JoeJulian After I've had my coffee I'll read the source and see what that means. gfid mismatch is obvious, but what's the .lck file? Is this something gluster's now creating?
17:01 haomaiwa_ joined #gluster
17:02 julim joined #gluster
17:05 ovaistariq joined #gluster
17:05 jiffin joined #gluster
17:10 plarsen joined #gluster
17:11 atinm joined #gluster
17:12 virusuy Anyone struggling to install a specific version of glusterfs-server on CentOS 7?
17:13 farhoriz_ joined #gluster
17:15 chirino_m joined #gluster
17:16 rcampbel3 joined #gluster
17:17 JoeJulian What problem are you having?
17:21 virusuy I'm trying to install glusterfs-server-3.6.2 in Centos 7
17:22 virusuy but when i run 'yum install'  fails with this error https://bpaste.net/show/cf37bf2c445c
17:22 glusterbot Title: show at bpaste (at bpaste.net)
17:22 hamiller joined #gluster
17:23 JoeJulian So in /etc/yum.repos.d/CentOS-Base.repo add "exclude=gluster*" to the [base] section.
17:25 virusuy JoeJulian: solved! thanks !
17:25 JoeJulian You're welcome.
17:25 JoeJulian You see why that's needed, right?
17:25 virusuy JoeJulian: seems like the error was about solving some dependencies of glusterfs-server
17:26 virusuy JoeJulian: Yeah, now glusterfs is in Base Repo , right ?
17:26 JoeJulian gluster 3.7 is included in EL7. Since it's a newer version, yum always tried to install the latest.
17:27 JoeJulian To avoid it selecting the newer version, you need to exclude that package from *that* repo so it will use the repo you installed.
17:27 virusuy Yes, gotcha
17:27 virusuy thanks again!
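The resulting repo stanza would look something like this; only the exclude line is the actual change, the other lines are the stock CentOS 7 [base] section and may differ slightly:

    # /etc/yum.repos.d/CentOS-Base.repo
    [base]
    name=CentOS-$releasever - Base
    mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
    exclude=gluster*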
17:29 cpetersen Hey how was coffee?  =)
17:30 JoeJulian Damned cup's empty and I keep looking at it hoping it's filled itself back up.
17:30 cpetersen Heheh.
17:30 JoeJulian It's one of *those* mornings.
17:31 matclayton left #gluster
17:31 cpetersen I'm not positive what that that .lck file is...
17:32 JoeJulian It's not created by gluster.
17:32 RameshN_ joined #gluster
17:32 cpetersen No I think it's vmware, but it doesn't make sense.
17:33 cpetersen I'm mounting the replicated volume from the same spot.  When I take server 1 down, the share never lost connectivity.
17:33 cpetersen It just now needs to coordinate failover.
17:34 cpetersen Somehow vmware causes split-brain with the HA coordination files, the files lock up due to split-brain and the HA fails.
17:35 JoeJulian Ok... gfid mismatch is caused when a file is created independently on two different bricks. Each one created the file and assigned a gfid (inode number, essentially) independently.
17:36 cpetersen The NFS share is mounted from servers A and C, but features.cache-invalidation is on ...
17:36 JoeJulian The only guess I have is that it's creating and deleting rapidly and somehow during the nfs failover the existence of the file is superimposed.
17:36 cpetersen Sorry, not A and C, but D and G.
17:36 siel joined #gluster
17:37 cpetersen So I have a pretty integral problem then.
17:37 JoeJulian It's Schrodinger's lck file.
17:37 swebb joined #gluster
17:37 Logos01 joined #gluster
17:37 JoeJulian Maybe you can use strace or a packet trace to confirm that the create/delete theory is correct.
17:37 cpetersen Pardon me while I google Schrodinger.
17:37 cpetersen ;)
17:38 karnan joined #gluster
17:38 sagarhani joined #gluster
17:38 JoeJulian @lucky Schrodinger's cat
17:38 glusterbot JoeJulian: https://en.wikipedia.org/wiki/Schr%C3%B6dinger's_cat
17:38 jotun joined #gluster
17:39 Lee1092 joined #gluster
17:39 sankarshan_away joined #gluster
17:39 jockek joined #gluster
17:39 Nakiri__ joined #gluster
17:39 ueberall_g joined #gluster
17:39 Logos01 left #gluster
17:39 DV joined #gluster
17:40 dron23 joined #gluster
17:40 JoeJulian But the thing is, you're not the first person to run vmware images on gluster, so why has nobody else complained about this?
17:40 n-st joined #gluster
17:40 lezo joined #gluster
17:41 sghatty_ joined #gluster
17:43 sadbox joined #gluster
17:44 virusuy joined #gluster
17:45 cpetersen Perhaps I am the first to run them on a replicated cluster.
17:45 cpetersen Using nfs-ganesha for HA?
17:46 fyxim joined #gluster
17:51 JoeJulian ganesha, maybe, but definately not the first on replicated.
17:56 sagarhani joined #gluster
17:56 cpetersen What alternatives are there to ganesha for HA?
17:57 cpetersen To be used with gluster rep vols?
17:57 chirino joined #gluster
17:58 JoeJulian Just the build-in nfs.
17:59 cpetersen What's the major difference?
17:59 virusuy Guys, is there any documentation about replacing a crashed node in a replicated volume with glusterfs-server 3.6?
17:59 JoeJulian hypothetically, what if you ran the ganesha server on the vmware hypervisor?
17:59 virusuy all i found is based on 3.4
18:00 JoeJulian Same thing, replace-brick <crashed brick> <new brick> commit force
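Spelled out in full, that command is (volume name and brick paths are placeholders):

    gluster volume replace-brick myvol \
        deadserver:/bricks/brick1 newserver:/bricks/brick1 \
        commit force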
18:00 JoeJulian cpetersen: difference is the native nfs is tcp only and version 3.
18:01 kkeithley cpetersen: the gluster nfs is NFSv3 only.  nfs-ganesha is NFSv4 (and v4.1, v3, pNFS)
18:01 haomaiwa_ joined #gluster
18:02 JoeJulian kkeithley: did you see his paste? http://paste.fedoraproject.org/324163/14557278/
18:02 glusterbot Title: #324163 Fedora Project Pastebin (at paste.fedoraproject.org)
18:02 cpetersen Well, the good news was that since that patch, 4.1 works for me now.  The share comes right back up after a failover of the cluster.  I just don't understand why I would be getting a GFID mismatch.  Gluster tracks all changes and NFS 4.1 is stateful.  It doesn't make any sense with cache-invalidation on.
18:03 cpetersen I mean yeah, the theory is that vmware writes it faster and gluster can't keep up.
18:03 cpetersen That's crazy talk.
18:03 virusuy JoeJulian: well, 3.4 uses another method instead of replace-brick
18:04 kkeithley JoeJulian: no
18:04 kkeithley not until just now. What's that from?
18:05 JoeJulian Replica 3, power-off brick-1
18:05 JoeJulian Power is still off to brick-1 and he gets that gfid mismatch.
18:05 cpetersen It's from my cross-post problem over in ganesha.
18:05 ivan_rossi left #gluster
18:05 julim joined #gluster
18:06 cpetersen Well, brick a, b and c.  Brick B is down still.
18:07 kkeithley cpetersen: nfs-ganesha is an NFS server. It doesn't give any HA per se.  HA is provided by pacemaker and corosync, which is only used to move the public (or virtual) IP of a failed node or a failed ganesha to one of the surviving nodes. When the public IP is moved, the surviving servers reclaim any locks held by the dead ganesha server.
18:07 Manikandan joined #gluster
18:08 cpetersen Right, so Ganesha is basically a modern NFS technology intended to replace NFS 3.  My problem is in how gluster handles the vmware ha files.
18:09 kkeithley gluster doesn't do anything with vmware ha files.
18:09 cpetersen Indeed as far as I can see, the NFS share is behaving appropriately.
18:09 virusuy JoeJulian: but what about if i want to keep the same ip address and the same hostname , copying UUID from the crashed server to the new one should do the trick, right ?
18:09 JoeJulian virusuy: yes.
18:09 virusuy JoeJulian: right
18:09 JoeJulian virusuy: then you'll need to create the brick path and "start...force"
18:10 virusuy JoeJulian: yeap, that brick is already created because bricks are stored in RAID, and OS in separate drives, and one of those drives failed, so i reinstalled OS only
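A sketch of the same-hostname recovery being discussed; the UUID is whatever the crashed node had (it can be read from the files under /var/lib/glusterd/peers/ on any surviving peer), and the volume name is a placeholder:

    # on the freshly reinstalled node, with the same gluster version installed
    systemctl stop glusterd        # start/stop it once first if glusterd.info does not exist yet
    sed -i 's/^UUID=.*/UUID=<uuid-of-the-crashed-node>/' /var/lib/glusterd/glusterd.info
    systemctl start glusterd       # volume definitions sync back from the peers
    gluster volume start myvol force   # respawns the brick on the surviving RAID data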
18:10 cpetersen kkeithley:  Joe's theory was that the files with gfid mismatch (which are vsphere-ha directory files) are being written and deleted quicker than gluster can handle it.
18:10 rGil joined #gluster
18:11 JoeJulian I mean it looks like a race condition.
18:11 cpetersen Because the NFS share is mounted on two different vSphere hosts.
18:12 JoeJulian *shouldn't* matter if it's 2 or 200.
18:12 cpetersen Well, 3.
18:15 cpetersen Bricks 1, 2 and 3.  NFS share from brick 1.  VMware servers A, B and C.  A and B have VMs running off of the replicated bricks.  I killed brick 2 and server B at the same time to simulate failure.  As soon as I did I got split-brain.
18:16 rGil I have a problem: I can mount gluster volumes with nfs or glusterfs on the nodes of the gluster cluster, but on the client I can only mount volumes with nfs. When I try to mount as glusterfs I receive "Transport endpoint is not connected".
18:16 kkeithley define what you mean by VMs running off of the replicated bricks?  You're not accessing the bricks directly somehow are you?
18:17 rGil Any idea ?
18:17 cpetersen Sorry, no.  Through the NFS share.
18:17 cpetersen A, B and C mount the NFS 4.1 share.
18:17 kkeithley okay.
18:17 cpetersen 1 and B are eliminated.
18:18 cpetersen VM from B should failover to A.
18:18 cpetersen Or C, whichever has compute capacity.
18:20 JoeJulian rGil: That's usually a firewall problem. ,,(ports)
18:20 glusterbot rGil: glusterd's management port is 24007/tcp (also 24008/tcp if you use rdma). Bricks (glusterfsd) use 49152 & up since 3.4.0 (24009 & up previously). (Deleted volumes do not reset this counter.) Additionally it will listen on 38465-38467/tcp for nfs, also 38468 for NLM since 3.3.0. NFS also depends on rpcbind/portmap on port 111 and 2049 since 3.4.
18:21 rGil JoeJulian I haven't put in any blocking rules; I already tested the ports from the client node and that works fine.
18:22 kkeithley That's vmware HA.  You killed brick 1, including ganesha on 1 by killing the node.  The pacemaker (gluster/ganesha) HA will move the public IP from brick 1 to brick 2 (or brick 3). You mounted using the public/virtual IP, right?  surviving ganeshas go into NFS-GRACE for 60 seconds after a fail-over. All writes should be suspended for the duration of NFS-GRACE.
18:24 cpetersen Oh so there is a possibility of the client writing to the NFS share even while NFS-GRACE period is active?
18:24 rGil JoeJulian all are on the same version of gluster and there are no firewall rules in the way. Mounting as type nfs works fine, but type glusterfs does not
18:26 JoeJulian "Transport endpoint is not connected" is a network error.
18:26 cpetersen Or all writes are suspended by Ganesha at that time?
18:26 kkeithley cpetersen: I misspoke, writes aren't blocked during NFS-GRACE (only open, lock, remove, rename, setattr AFAIK)
18:27 rGil JoeJulian but I can connect with nfs, so I don't think so
18:27 JoeJulian rGil: paste your client log to fpaste.org for a failed mount attempt.
18:30 cpetersen kkeithley:  There are three floating IPs for the cluster, one for each node.  Traditionally I'm used to there only being one and it floating between nodes.  I am using the floating IP from 1 to mount the NFS share.  Is this wrong?
18:31 cpetersen Additionally, in this scenario, brick 2 was not serving the share on its floating IP, so no failover on the PCS cluster occurred.
18:31 JoeJulian You're not floating the same IP that the volume hostnames resolve to, right?
18:32 cpetersen No.
18:32 JoeJulian whew
18:33 rGil JoeJulian http://ur1.ca/oj7dh
18:33 glusterbot Title: #324231 Fedora Project Pastebin (at ur1.ca)
18:34 cpetersen I have 3 IPs total.  1, statically assigned to NIC 0 for mgmt.  2, statically assigned layer 2 only on NIC 1 for gluster storage communications, 3, floating IP only on NIC 2 managed by the cluster - cluster not running = NIC not assigned an IP
18:34 shubhendu joined #gluster
18:35 rGil JoeJulian wait a min, not pasted all
18:36 rGil JoeJulian now is right http://ur1.ca/oj7e7
18:36 glusterbot Title: #324233 Fedora Project Pastebin (at ur1.ca)
18:37 cpetersen When I kill brick 2 and server B, the VM on server B should fail over to server C, which consumes the share from brick 1's floating IP, and no interruption to the cluster or NFS share should be observed.
18:37 cpetersen Sorry, I misspoke previously.  I killed brick 2 and server B, not brick 1 and server B.
18:38 kkeithley Just so I'm clear,  you have three bricks (1, 2, 3). These are not the same machines as servers (A, B, C), right? Or not right?
18:38 JoeJulian Ah, rGil, you need to fix your hostname resolution. See that between lines 5 and 16?
18:38 kanagaraj joined #gluster
18:39 rGil JoeJulian i see that, thx i will try to fix.
18:39 cpetersen You're on to me.  For the sake of not overcomplicating things, they are not the same machines logically.
18:40 rGil JoeJulian thx, solved.
18:40 kkeithley Leaving vmware out for a minute.  On the three bricks you should have gluster running, and ganesha.  And apart from the mgmt NIC 0 and static NIC 1 IPs on each brick, you also have a floating IP managed by pacemaker.  So nine IPs in total.
18:41 cpetersen Correct.
18:41 kkeithley okay, good
18:41 cpetersen =)
18:41 kkeithley so when you kill brick 2, its floating IP will get moved by pacemaker to another brick.
18:41 cpetersen Correct.
18:41 cpetersen And it does.
18:41 kkeithley good
18:42 cpetersen That part is fantastic because your patch worked great.
18:42 cpetersen NFS is served through brick 1's floating IP, so there is literally no interruption to the share.
18:43 kkeithley cool
18:43 kkeithley so you kill brick 2 and server B.  brick 2's floating IP should move to another machine, but nothing is mounted from brick 2, so this part, at least, should be a no-op.
18:44 cpetersen Correct.
18:44 kkeithley and you kill server B and get a split brain.
18:44 kkeithley somehow
18:44 cpetersen Correct.
18:45 kkeithley because the ganesha server (itself a gluster client) is writing to three replicas
18:45 cpetersen http://ur1.ca/oj7hd = information from gluster v heal * info
18:45 glusterbot Title: #324239 Fedora Project Pastebin (at ur1.ca)
18:45 kkeithley hmmm
18:46 cpetersen Well as far as the client is concerned, it's all pointing to the same logical LUN.  So the only one writing HA information should be the master vmware HA server.
18:46 cpetersen So that's odd to me.
18:47 cpetersen Either that master or the vCenter server that is.
18:47 cpetersen That share is also not used as heartbeat, only to mount VMs for HA.
18:47 cpetersen I immediately got split-brain previously when I attempted to use it for heartbeat.
18:47 cpetersen =)
18:52 kkeithley yeah, I don't know what vmware HA server could be doing that would cause that.
18:56 cpetersen VMware you little bitch...
19:00 theron joined #gluster
19:01 haomaiwang joined #gluster
19:02 kkeithley just because one replica of a replica 3 volume drops out shouldn't cause a split brain.  I'll have to consult with our afr devs in India in the AM
19:03 cpetersen JoeJulian: It's because I don't have any fencing.
19:03 coredump joined #gluster
19:04 cpetersen =]
19:04 ahino joined #gluster
19:14 ovaistariq joined #gluster
19:15 social joined #gluster
19:22 farhoriz_ joined #gluster
19:28 JoeJulian lol
19:33 ovaistariq joined #gluster
19:41 das_j joined #gluster
19:43 das_j Hey, I'm currently setting up a Gluster cluster with SSL enabled on the management path. I have let's encrypt certificates on all machines and I have created a symlink glusterfs.ca -> /etc/letsencrypt/live/mydomain.tld/chain.pem. When I start up the cluster, I get log messages that the ca is unknown: "error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed" What can I do in this
19:43 das_j case?
19:52 cpetersen kkeithley: Interesting.  I changed my VMware datastore heartbeat setting from "Use datastores only from the specified list and complement automatically if needed" to "Use datastores only from the specified list."  I set the failure-interval to 60 seconds, basically that's how long to wait until failover.  I see healing files, but I don't see split-brain.
19:54 cpetersen The key with the first setting is that I have 3 shared datastores, 2 dedicated to heartbeat and the other the actual VM storage.  I was thinking that it may have been dipping in to the VM storage datastore for heartbeating if "complement automatically" was enabled.
19:54 cpetersen It also might be the delay as well.
19:56 cpetersen JoeJulian: What does it mean if I have multiple entries for bricks 1 and 3 displayed on the heal command, but nothing under statistics?
19:58 JoeJulian I'm not really sure. I have the same thing, some 20TB images healing, I can tell that they're actively healing because of the inode lock start location in a state dump, but statistics still shows nothing happening.
19:58 rwheeler joined #gluster
19:59 JoeJulian I was hoping to get a chance to ask pranithk about that one of these nights.
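(A sketch of how to compare the two views being discussed, with a hypothetical volume name; "heal-count" gives a per-brick count of entries still needing heal, while plain "statistics" shows the self-heal daemon's crawl history:)
    gluster volume heal gv0 info
    gluster volume heal gv0 statistics heal-count
    gluster volume heal gv0 statistics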
19:59 cpetersen Yikes, that's a lot of data to be unsure of!
19:59 JoeJulian Tell me about it.
20:00 cpetersen Your faith hath not been shaken?
20:01 JoeJulian "So what size images should we allow?" "No more than 4TB. Beyond that it's going to mess with distribution and heals will take forever." "Ok, we'll give the customer 20TB images."
20:01 JoeJulian :head desk:
20:01 haomaiwa_ joined #gluster
20:01 cpetersen The weird part in my case is that the files are not large at all.  They are actually super small.
20:02 cpetersen I ran a manual heal several times but that's all it says.  Glustershd log has nothing on it.  lol
20:02 JoeJulian 60TB bricks. I'm not pleased.
20:02 cpetersen I should check out my brick log.
20:02 JoeJulian mkdir /var/run/gluster; pkill -USR1 glusterfsd; less /var/run/gluster/* +self-heal
20:03 cpetersen ?!?!!
20:03 JoeJulian If it finds anything, then heals are happening. You can look at the "start=" in the section to see where it is.
20:06 cpetersen Won't pkill terminate that process and stop healing?
20:09 cpetersen I ran "less /var/run/gluster/* +self-heal" but nothing came back.  It just sent me into lost+found but nothing else beyond some .socket files not being regular files.
20:10 cpetersen Odd that it would say it's healing though it's clearly not.
20:11 JoeJulian pkill -USR1 sends SIGUSR1 to the process which it is programmed to interpret as "dump a system state for this process to /var/run/gluster"
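(A rough equivalent using the gluster CLI instead of signalling the processes directly; the volume name is hypothetical and the exact section names in the dump vary by version:)
    gluster volume statedump gv0
    # statedump files land in /var/run/gluster/ by default; search them for active self-heal locks
    grep -i -A3 'self-heal' /var/run/gluster/*.dump.*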
20:11 SpeeR joined #gluster
20:14 uebera|| joined #gluster
20:14 uebera|| joined #gluster
20:15 cpetersen OK I ran the whole thing, same result.
20:18 cpetersen http://ur1.ca/oj87y
20:18 glusterbot Title: #324312 Fedora Project Pastebin (at ur1.ca)
20:31 cpetersen OIC.
20:32 cpetersen I brought the brick that was down back up again and the heal messages went away.  Supposedly they were there because they were to be written to the delinquent brick.
20:32 cpetersen Waha!
20:32 JoeJulian Ah, yes. I'm sorry I assumed you had already brought it back up.
20:33 cpetersen :O
20:33 cpetersen That means I just had a successful test of failover.
20:33 * cpetersen knocks on wood and punches face
20:34 JoeJulian [insert image of kermit the frog cheering] Yay!
20:35 cpetersen Now to run several more similar tests that will upset me greatly.
20:39 kkeithley cpetersen: sounds like progress then
20:41 cpetersen Yeppers.
20:42 ovaistariq joined #gluster
20:45 mhulsman joined #gluster
20:46 B21956 joined #gluster
20:46 mhulsman joined #gluster
20:50 merp_ joined #gluster
20:52 merp_ Wondering if anyone can help me with this problem, i've got three servers with a replicated gluster volume (1 x 3) and after upgrading to 3.7.8 my write performance has dropped to 1-4 MB/s (gigabit network between them)
20:52 merp_ i'm at a loss for how to debug this
20:52 merp_ the storage devices themselves are fast (600 MB/s), the network is unsaturated (800 Mbit/s with iperf between each server)
20:53 merp_ when mounting as a fuse filesystem the slow write performance manifests, if i mount as NFS its not a problem
20:53 merp_ unfortunately i can't use nfs for production apps
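(A sketch of the comparison merp_ describes, with hypothetical hostnames and mount points; gluster's built-in NFS server speaks NFSv3 only, hence vers=3:)
    mount -t glusterfs server1:/gv0 /mnt/fuse
    dd if=/dev/zero of=/mnt/fuse/bwtest bs=1M count=128

    mount -t nfs -o vers=3,tcp server1:/gv0 /mnt/nfs
    dd if=/dev/zero of=/mnt/nfs/bwtest bs=1M count=128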
21:01 haomaiwa_ joined #gluster
21:02 ctria joined #gluster
21:09 theron joined #gluster
21:10 BuffaloCN joined #gluster
21:17 BuffaloCN joined #gluster
21:22 merp_ this may be some kind of bug in glusterfs 3.7.8, i've created two new test clusters, one in rackspace and one in amazon aws
21:22 merp_ write performance is the same (1-4 MB/s) for a simple "dd if=/dev/zero of=bwtest bs=1M count=128"
21:23 merp_ with the previous 3.7 version (3.6.5?) i was getting 60-100 MB/s with the exact same hardware/configuration
21:23 merp_ er 3.7.5
21:37 JoeJulian merp_: If it's that reproducible, please file a bug report. I, personally, haven't upgraded from 3.7.6 to 3.7.8 yet so I haven't seen any issues.
21:37 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
21:37 merp_ filing one now
21:37 JoeJulian Check your logs, of course, for clues.
21:37 merp_ yeah, no luck there; all quiet on the western front
21:38 JoeJulian Figured
21:39 merp_ https://bugzilla.redhat.com/show_bug.cgi?id=1309462
21:39 glusterbot Bug 1309462: high, unspecified, ---, bugs, NEW , Upgrade from 3.7.6 to 3.7.8 causes massive drop in write performance.  Fresh install of 3.7.8 also has terrible write performance
21:40 merp_ How do the people in this channel feel about the overall stability of GlusterFS 3.7?
21:40 merp_ we've been having bad luck with it, 3.7.6 was leaking memory like crazy
21:41 JoeJulian A word of advice when writing bug reports (or reporting problems in general): try to avoid opinion words, like bad, massive, terrible, painful, etc. and use quantifiable data instead. (my 2c)
21:41 merp_ okay, thats fair; going to update it
21:42 JoeJulian As for stability, I haven't hit the memory leaks, but post-factum has done a ton of work identifying them so they could be fixed in 3.7.8.
21:43 hagarth merp_: can you please disable write-behind and check performance?
21:45 merp_ that had a huge effect
21:46 hagarth merp_: as in?
21:46 merp_ performance is back to what it was before the upgrade
21:46 merp_ testing on a few different systems
21:48 merp_ this is great, thanks hagarth!
21:49 merp_ that seems to have resolved the issue in all my environments
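(The change hagarth suggested, spelled out for anyone reading along; the volume name is hypothetical, and the option can be put back to its default with a reset:)
    gluster volume set gv0 performance.write-behind off
    # to restore the default later:
    gluster volume reset gv0 performance.write-behind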
21:50 * JoeJulian pictures servers in a rainforest, desert, tundra....
21:59 virusuy lol
22:01 haomaiwa_ joined #gluster
22:01 virusuy is there any case where write-behind isn't recommended ?
22:02 JoeJulian When it's super slow? ;)
22:02 virusuy lol
22:05 rcampbel3 joined #gluster
22:07 dlambrig_ joined #gluster
22:10 hagarth virusuy: write-behind should not ideally cause super slowness... on the contrary, it should aid performance
22:19 merp_ @joejulian, yeah our testing was focused mainly on the memory leaks between 3.7.6 and 3.7.8.  3.7.8 seems to have fixed all the cases where it was leaking in our environment
22:20 merp_ somehow we missed the write performance problem :(
22:20 dlambrig_ left #gluster
22:21 merp_ prior to the update we had to manually kill a glusterfs process about once per week (it would just grow until OOM)
22:24 cpetersen JoeJulian: Question, is there any way to make replication through gluster be like RAID 1?  To elaborate, rather than writing to brick 1 then cascading to brick 2, could you throttle write speed so that everything is written simultaneously as live as possible?
22:24 SpeeR if I have a 3 node cluster with 2 bricks each node , replicated 2 times, and I add another node with 2 bricks... first, is that possible, and second does that make it a distributed replica at that point?
22:26 JoeJulian cpetersen: it already is that way. The client writes to the replicas simultaneously.
22:30 SpeeR I'm thinking I need to add 2 nodes instead of 1 since that is the amount of replicas
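(A sketch of the expansion SpeeR is describing: on a replica 2 volume, bricks must be added in multiples of two, and placing the two new bricks on different nodes keeps the redundancy. Hostnames and paths are hypothetical:)
    gluster volume add-brick gv0 server4:/data/brick1 server5:/data/brick1
    gluster volume rebalance gv0 start    # spread existing data onto the new replica pair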
22:34 gildub joined #gluster
22:40 theron joined #gluster
22:42 hagarth merp_: what is your use case?
22:43 cpetersen Hmmm, ok. :)
22:44 cpetersen The heal info command is a bit weird.  When it says there are files, those are to be replicated to the brick when it comes back up.  When it says "possibly undergoing healing," that is when it is actually healing between nodes.
22:44 cpetersen If I understand correctly.
22:44 merp_ hagarth, somewhat suboptimal for Gluster but we have a ton of sqlite files that are modified fairly often (small file operations)
22:46 post-factum JoeJulian: not all memleaks are fixed in 3.7.8, waiting for 3 more patches to be merged into 3.7.9
22:48 hagarth merp_: ok
22:48 post-factum btw, what's going on with write-behind in 3.7.8?
22:49 ovaistar_ joined #gluster
22:49 hagarth post-factum: I suspect it to be a side-effect of https://github.com/gluster/glusterfs/commit/3fcead2de7bcdb4e1312f37e7e750abd8d9d9770
22:49 glusterbot Title: performance/write-behind: retry "failed syncs to backend" · gluster/glusterfs@3fcead2 · GitHub (at github.com)
22:49 hagarth but haven't had the time to look into that in detail
22:51 kovshenin joined #gluster
22:52 post-factum meh. i was going to deploy 3.7.8 tomorrow on several vms :(
22:52 hagarth post-factum: ouch :(
22:53 post-factum i guess i will test write speed before that as i have write-behind enabled as well
22:53 post-factum could that commit be reverted?
22:53 post-factum i mean, by myself
22:53 post-factum with no side effecs
22:53 post-factum s/effecs/effects/
22:53 glusterbot What post-factum meant to say was: with no side effects
22:54 post-factum or it is enough just to set performance.resync-failed-syncs-after-fsync to 0?
22:55 hagarth post-factum: let me confirm my hypothesis
22:56 post-factum okay
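(The option post-factum mentions can be flipped per volume; whether that alone avoids the slowdown was still being confirmed above, so treat this only as a sketch with a hypothetical volume name:)
    gluster volume set gv0 performance.resync-failed-syncs-after-fsync off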
22:57 Logos01 joined #gluster
22:58 Logos01 Howdy, folks.  I need to disable the TLS encryption on a number of volumes. I've turned off the client.ssl and server.ssl parameters, but I can no longer mount the clients after setting "auth.ssl-allow" to "off". Can someone point me to the proper procedure for this?
23:04 Logos01 Hrm. Seems I needed to restart the volumes... even though I rebooted every brick host. Odd.
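(For anyone following along, one sequence matching what Logos01 describes, with a hypothetical volume name; the volume restart is what finally made the clients pick up the change:)
    gluster volume set gv0 client.ssl off
    gluster volume set gv0 server.ssl off
    gluster volume reset gv0 auth.ssl-allow
    gluster volume stop gv0 && gluster volume start gv0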
23:09 n-st joined #gluster
23:21 theron joined #gluster
23:21 cpetersen FFS, I tested failing brick 1 and server A.  The floating IP moved properly, but ESX won't bring the share back up.
23:33 caitnop joined #gluster
23:55 Logos01 cpetersen: That can be an issue when the management connection is no longer viable -- it requires the management connection to be current for volume configurations to be propagated to clients.
23:55 Logos01 That's my understanding.
23:55 Logos01 I ran into something relatively similar with my attempted use of a floating IP
23:56 Logos01 If you *CAN* do rrdns, that's probably a better option.
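(A sketch of the rrdns approach, with hypothetical names and addresses: a single DNS name resolving to every brick host, used as the mount server:)
    gluster.example.com.  300 IN A 192.0.2.11
    gluster.example.com.  300 IN A 192.0.2.12
    gluster.example.com.  300 IN A 192.0.2.13

    mount -t glusterfs gluster.example.com:/gv0 /mnt/gluster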
