
IRC log for #gluster, 2017-11-01


All times shown according to UTC.

Time Nick Message
00:21 plarsen joined #gluster
00:41 Len left #gluster
00:56 Jacob843 joined #gluster
01:03 vbellur joined #gluster
01:12 baber joined #gluster
01:38 vbellur joined #gluster
02:04 msvbhat joined #gluster
02:56 ilbot3 joined #gluster
02:56 Topic for #gluster is now Gluster Community - https://www.gluster.org | Documentation - https://gluster.readthedocs.io/en/latest/ | Patches - https://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - https://botbot.me/freenode/gluster/ & http://irclog.perlgeek.de/gluster/
03:21 boutcheee520 joined #gluster
03:22 koolfy joined #gluster
03:23 boutcheee520 Hello everyone, I was wondering if someone could help me out with an error I am receiving.
03:24 boutcheee520 I have a very old server in my dev environment, running RHEL 6.9 with gluster 3.7 installed on it. I just spun up a new CentOS-7 server, installed Gluster 3.12 on it and want to join this new server to the existing pool
03:25 boutcheee520 After turning off SELinux on my new server and adding a rich rule in my firewalld zone, I run: gluster peer probe <new_server_hostname>
03:26 boutcheee520 I receive a "peer probe success." but then when I run, gluster peer status, the state of my new server shows "Peer Rejected (Connected)"
03:27 boutcheee520 ^ doesn't seem "normal" to me? I'm pretty new to gluster and have never tried setting it up before, so I am learning as I go...
03:27 boutcheee520 Thanks in advance
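For reference, a minimal sketch of the sequence described above, with a placeholder hostname; the probe is assumed to be run from a node already in the trusted pool:

    gluster peer probe new-server    # returns "peer probe: success." even if the state later degrades
    gluster peer status              # here shows: State: Peer Rejected (Connected)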
03:46 psony joined #gluster
04:01 atinm joined #gluster
04:22 javi404 joined #gluster
05:44 ahino1 joined #gluster
05:45 msvbhat joined #gluster
06:16 armyriad joined #gluster
06:17 omie888777 joined #gluster
06:17 shyu joined #gluster
06:32 armyriad joined #gluster
06:43 major joined #gluster
06:58 jkroon joined #gluster
07:20 mwaeckerlin joined #gluster
07:38 marbu joined #gluster
07:38 rouven joined #gluster
07:49 rouven joined #gluster
08:04 omie888777 joined #gluster
08:26 DV joined #gluster
08:32 rwheeler joined #gluster
08:54 marbu joined #gluster
09:06 buvanesh_kumar joined #gluster
09:47 mbukatov joined #gluster
10:38 rouven joined #gluster
11:02 rouven joined #gluster
11:02 rouven left #gluster
11:05 baber joined #gluster
11:19 TBlaar joined #gluster
11:43 ThHirsch joined #gluster
12:39 gyadav__ joined #gluster
12:48 marbu joined #gluster
12:48 phlogistonjohn joined #gluster
12:54 ThHirsch joined #gluster
13:04 kramdoss__ joined #gluster
13:07 baber joined #gluster
13:32 plarsen joined #gluster
13:34 melliott joined #gluster
13:42 DV joined #gluster
13:45 hmamtora joined #gluster
13:54 kramdoss__ joined #gluster
14:02 farhorizon joined #gluster
14:16 phlogistonjohn joined #gluster
14:30 Acinonyx joined #gluster
14:38 Shu6h3ndu joined #gluster
14:42 bwerthmann joined #gluster
14:45 omie888777 joined #gluster
14:53 smohan[m] joined #gluster
14:55 kpease_ joined #gluster
15:03 wushudoin joined #gluster
15:12 bwerthma1n joined #gluster
15:17 vbellur joined #gluster
16:24 Twistedgrim joined #gluster
16:31 atrius joined #gluster
16:35 DV joined #gluster
16:44 baber joined #gluster
16:52 Shu6h3ndu joined #gluster
16:54 buvanesh_kumar joined #gluster
17:12 jkroon joined #gluster
17:20 int-0x21 joined #gluster
17:20 int-0x21 Hi
17:20 glusterbot int-0x21: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer
17:21 int-0x21 I'm setting up a new HA datastore for our VMware environment and also an iSCSI target for MSSQL databases. Does anyone use gluster for something similar?
17:22 int-0x21 As an initial test I'm planning on doing 2 hosts with 5 960G NVMe disks and 10 3TB 7200rpm SATA disks; I will also add a third host to be arbiter
17:24 int-0x21 I'm thinking of utilising ZFS and doing 2 pools: one 3+1 NVMe in replicate + arbiter, and one striped 4+1 with an NVMe SLOG in replicate + arbiter
17:24 int-0x21 Then do multipath iSCSI to the 2 storage servers (not to the arbiter)
17:25 int-0x21 Does this sound like something that will work?
17:25 int-0x21 I have found it hard to find good performance numbers for gluster and "block" scenarios
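A rough sketch of how that ZFS layout might look, purely illustrative; pool names and device paths are made up, and reading "3+1" / "4+1" as raidz1 vdevs is an assumption:

    # hypothetical 3+1 NVMe pool for the fast bricks
    zpool create fastpool raidz1 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
    # hypothetical 4+1 SATA pool with an NVMe SLOG for the bulk bricks
    zpool create bulkpool raidz1 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
    zpool add bulkpool log /dev/nvme4n1
    # datasets on each pool would then back the gluster bricks
    zfs create fastpool/brick1
    zfs create bulkpool/brick1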
17:27 bwerthmann joined #gluster
17:32 NoctreGryps joined #gluster
17:33 NoctreGryps Greetings Gluster community. I find myself at a loss. I have inherited a distributed-replicated 2x2 system that not all nodes in the cluster see each other, and as it is our production system I can't just stop all the services, attempt to get them peering correctly, and pray that it didn't make the problem worse. So I've tried setting up a clean, brand new installation of gluster on new VMS, and I can't even get a mount to be su
17:34 NoctreGryps copy data over in anticipation to move everything to a "healthy" system. Is there gluster for dummies documentation somewhere, because even following the docs.gluster.org I'm having problems?
17:35 NoctreGryps I tried approaching this chat room in the last several weeks to find information on how to heal/fix the unhealthy set of nodes, but not much luck there.
17:37 NoctreGryps Specifically, I'm looking for information on non-happy-path peer states, and their troubleshooting; or else for information on why a gluster mount activity would fail indicating subvolumes are down when I haven't created subvolumes (this is at the point of trying to mount a cluster node onto a cloned VM that has all the data that the original gluster installation has).
17:40 buvanesh_kumar joined #gluster
17:46 JoeJulian int-0x21: I've heard of other people doing that, yes. I have not seen any performance info.
17:47 ThHirsch joined #gluster
17:47 int-0x21 It's definitely a jungle trying to create a decent storage system for VMware :) information is mostly available for small lab setups, not for enterprise solutions
17:48 JoeJulian NoctreGryps: I'm sorry you've had poor results here. Most of us were in Prague for a conference.
17:48 int-0x21 I'm trying to replace our HP LeftHand systems since they are way too expensive for poor performance
17:48 JoeJulian int-0x21: yeah, that's one of the great joys of using something that costs a lot of money. They're invested in making it really hard to use anything that makes it easier.
17:50 int-0x21 I will start testing gluster tomorrow when delivery of the storage blocks I described above arrives, but was hoping to get some tips from someone who has done it before
17:50 JoeJulian NoctreGryps: So your first paragraph cut off at "mount to be suc" as IRC has a line-length limit.
17:51 JoeJulian int-0x21: I should be around either way during the day (GMT-7). Though I haven't done the iscsi target, I do know the gist of what needs to be done to make that happen.
17:51 int-0x21 So far I tested Ceph, but block storage performance is horrid once it goes to iSCSI, so it's not ready for VMware (can't really replace all hosts with KVM or something)
17:51 JoeJulian NoctreGryps: Most of troubleshooting non-happy peers is looking in the glusterd.log files.
17:52 JoeJulian int-0x21: what's "horrid"?
17:52 JoeJulian iops? throughput?
17:52 int-0x21 NFS is definitely an alternative that I might even prefer for the VMFS datastore, but for the MSSQL failover cluster I sadly cannot use NFS, so I might be forced to have iSCSI for that purpose
17:53 JoeJulian I know a _lot_ of people use nfs for vmware.
17:53 JoeJulian You can do both.
17:53 int-0x21 Both actually; I had throughput of 15MB/sec and timeouts, while when testing the Ceph locally it was 600
17:54 int-0x21 So I think the issue was mostly that the iSCSI gateway was still not production ready; Ceph might have been an alternative in 2-3 years, but not now
17:54 int-0x21 I went on to create a Pacemaker failover ZFS storage with NFS and iSCSI, but there are some stability issues there
17:55 * JoeJulian shudders
17:55 int-0x21 The timeouts from removing a disk cause the entire cluster to go down, and so on
17:55 int-0x21 hence I was thinking of letting ZFS just do the RAID for the bricks (so I get NVMe on top of the spinners) and allow a failed disk in each brick
17:55 int-0x21 and then gluster for the cluster filesystem
17:56 JoeJulian seems reasonable
17:56 int-0x21 Also been considering bcache, but that doesn't give me the benefit of allowing a disk to go down in a brick, and that leaves me vulnerable if a host is down and one disk has a bad block, sort of thing
17:56 JoeJulian I still think I would use native NFSv3 for vmware and only set up the iscsi targets for MSSQL.
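If the native-NFS route is taken for the VMFS datastore, the gluster side is just a volume option (gv0 is a placeholder; note that gluster's built-in NFS server only speaks NFSv3 and is disabled by default on newly created volumes in recent releases):

    gluster volume set gv0 nfs.disable off
    # ESXi would then mount  server:/gv0  as an NFSv3 datastore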
17:56 NoctreGryps Republishing:
17:57 NoctreGryps Greetings Gluster community. I find myself at a loss. I have inherited a distributed-replicated 2x2 system that not all nodes in the cluster see each other, and as it is our production system I can't just stop all the services, attempt to get them peering correctly, and pray that it didn't make the problem worse.
17:57 NoctreGryps So I've tried setting up a clean, brand new installation of gluster on new VMs, and I can't even get a mount to be successful under version 3.10 so I can
17:57 NoctreGryps copy data over in anticipation of moving everything to a "healthy" system. Is there gluster-for-dummies documentation somewhere, because even following docs.gluster.org I'm having problems?
17:57 int-0x21 Yeah, that's what I think I will do actually, since I get better NFS performance than iSCSI performance in all my tests
17:57 JoeJulian Fewer context switches.
17:58 int-0x21 How come not NFS4 though, since that supports multipathing in VMware?
17:58 int-0x21 Then I don't need to bother about Pacemaker and a floating VIP and such
17:58 Peppard joined #gluster
17:58 JoeJulian Because I don't use vmware and I didn't realize it supported multipathing.
17:58 int-0x21 Ah :)
17:58 JoeJulian that will require ganesha
17:59 NoctreGryps_ joined #gluster
17:59 JoeJulian Still a good solution.
17:59 int-0x21 Yeah, I want to keep the failover scenarios as clean as possible
18:00 int-0x21 The kernel exports NFS4 though? Is Ganesha really needed?
18:00 JoeJulian NoctreGryps_: What's the mount error? (see the log file)
18:01 JoeJulian (/var/log/glusterfs/<mount-path-with-slashes-changed-to-dashes>.log)
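For example, with a hypothetical mount point:

    mount -t glusterfs server1:/gv0 /mnt/gv0
    # the client log for that mount would be:
    #   /var/log/glusterfs/mnt-gv0.log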
18:02 int-0x21 I considered Lustre also, but I think it puts too much need on JBODs with SAS drives for multipathing and failover controllers (I don't like active/passive failover stuff, it's always what crashes)
18:03 JoeJulian It's also well known to lie about whether or not the data was actually written.
18:03 NoctreGryps_ Unhealthy sub-volumes being found; the kicker is, this is a brand new, clean installation. I am literally trying to mount an empty base directory to start transfer of data.
18:04 JoeJulian make a clean mount file and ,,(paste) it, please.
18:04 glusterbot For a simple way to paste output, install netcat (if it's not already) and pipe your output like: | nc termbin.com 9999
18:05 int-0x21 Well, thanks for the input JoeJulian, I will start testing tomorrow. If anyone has any HA iSCSI multipathing experience I'll be glad for tips
18:07 int-0x21 If it works well and I can production-approve it I'll post a writeup about it (way too hard to find information about something that is so central and crucial to every VMware shop)
18:09 JoeJulian That would be awesome. Let us know and we'll spread the word.
18:11 int-0x21 Will do :)
18:15 buvanesh_kumar joined #gluster
18:18 NoctreGryps_ The first non-Info level event is this, at Warn: Directory selfheal failed: 1 subvolumes down.Not fixing. path = /, gfid =
18:18 NoctreGryps_ There are no Error level events in that log.
18:41 JoeJulian NoctreGryps_: That may be a recoverable state. Hard to say from here.
18:53 msvbhat joined #gluster
19:04 jkroon joined #gluster
19:22 gbox JoeJulian: welcome back!  It was bleak here for the last few weeks--just when I decided to upgrade to 3.12.  Gave me time to check the irc logs for info though.
19:22 glusterbot gbox: weeks's karma is now -1
19:24 boutcheee520 joined #gluster
19:24 gbox JoeJulian:  You seem to advocate 3x replica with or without arbiter, but those 2 options seem very different.  Does going without arbiter provide any benefit other than an extra full copy of each file?
19:27 boutcheee520 Hello everyone, I was wondering if someone could help me out with an error I am receiving.
19:28 boutcheee520 I have a very old server in my dev environment, running RHEL 6.9 with gluster 3.7 installed on it. I just spun up a new CentOS-7 server, installed Gluster 3.12 on it and want to join this new server to the existing pool
19:28 boutcheee520 After turning off SELinux on my new server and adding a rich rule in my firewalld zone, I run: gluster peer probe <new_server_hostname>
19:28 boutcheee520 I receive a "peer probe success." but then when I run, gluster peer status, the state of my new server shows "Peer Rejected (Connected)"
19:28 boutcheee520 ^ doesn't seem "normal" to me? I'm pretty new to gluster and have never tried setting it up before, so I am learning as I go..
19:28 boutcheee520 Thanks in advance!..
19:28 gbox boutcheee520: everything I have seen discourages mixing peers with different versions
19:29 gbox boutcheee520: having the new node as a client would work.  Do you plan to upgrade everything?
19:30 boutcheee520 ideally, have 2 new servers running CentOS-7 with the latest version of Glusterfs on them. Then migrate the bricks from the old servers to the new ones
19:31 boutcheee520 I think RHEL 6.9 only supports up to Gluster 3.10 ..? If I am not mistaken
19:33 boutcheee520 basically in my dev environment, my gluster bricks are 40G but in PROD it's like 7+ terabytes, so I want to figure out the "best" upgrade path to do this
19:33 gbox boutcheee520:  I have exactly the same situation as you, and am very carefully migrating data over
19:34 gbox I've seen some crazy suggestions, like kill the gluster volume leaving the data on the bricks, upgrade, create a new volume, and then hope healing puts everything back in place.  Can't imagine that would work though
19:35 gbox 3.10 is a lot closer to 3.12 than 3.7 though
19:36 gbox boutcheee520: are you considering any of the new features like arbiter or tiering?
19:38 gbox I kind of see tiering as a way to add a new volume to an existing one, with the new volume being the preferred one for data.  The cold tier could essentially be a fallback.  Most of the docs describe it differently though, as if the hot tier needs to be fast & small
19:39 gbox Tiering question:  If one tier goes down does the data on the other tier remain accessible?  I would guess yes?
19:41 boutcheee520 ^ have not even considered tiering. Very new to actually gluster honestly. I'm learning as I go ha...
19:42 boutcheee520 good point on the 3.10 upgrade since it is closer to 3.12
19:42 boutcheee520 I was hoping I could skip that so I wouldn't have to upgrade twice?
19:43 gbox boutcheee520: You've got a big new space to use.  Why not set up a new volume, copy the data over, clean up the old space, then figure out how to incorporate it?
19:44 boutcheee520 upgrade old servers to 3.10 , install 3.10 on the new servers , join new servers to existing "cluster" , migrate data to new volumes on the new servers , then upgrade from 3.10 to 3.12 on the new servers
19:44 boutcheee520 Hmmm, yeah I might try that out
19:48 gbox boutcheee520: No, treat them as separate systems.  Create a new 3.12 volume on the new servers, copy (rsync, etc) data from the 3.7 to the 3.12.  Then you can do whatever with the old dev systems
19:49 boutcheee520 Ahhh okay, cool. Definitely going to need to start a screen session so my connection does not die
19:49 boutcheee520 thanks for the advice
19:49 gbox boutcheee520: sure that's the cleanest way to go!
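A sketch of that treat-them-as-separate-systems approach; hostnames, volume names and brick paths are placeholders, and the replica layout would need to match the real environment:

    # on the new CentOS 7 pair running 3.12
    gluster volume create newvol replica 2 c7-1:/bricks/b1/brick c7-2:/bricks/b1/brick
    gluster volume start newvol
    # mount both volumes somewhere and copy through the mount points
    mount -t glusterfs old-el6:/oldvol /mnt/oldvol
    mount -t glusterfs c7-1:/newvol /mnt/newvol
    rsync -aX /mnt/oldvol/ /mnt/newvol/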
19:51 vbellur joined #gluster
19:52 vbellur1 joined #gluster
19:53 vbellur joined #gluster
20:01 boutcheee520 sudo rsync -aAXv /source/path /destination/path
20:02 boutcheee520 so preserve ACLs, extended attributes, archive and verbose
20:02 boutcheee520 what can go wrong rsyncing 7+ terabytes lol
20:08 Neoon joined #gluster
20:18 gbox boutcheee520: The gluster ACLs won't migrate, so unless you have other ACLs you don't need that
20:19 gbox boutcheee520: there's a webpage out there where someone explains how to do multiple rsyncs in parallel to really take advantage of gluster
20:20 gbox boutcheee520: with a decent network you can run 7 rsyncs in parallel and saturate the network
20:20 vbellur joined #gluster
20:20 gbox boutcheee520: the trick is to chop up your data, but 7TB isn't all that much
20:22 vbellur1 joined #gluster
20:26 boutcheee520 touche
20:26 boutcheee520 I keep seeing a program called parallel be used for this to work
20:30 gbox boutcheee520: Yeah that would work too if you just mount the 3.7 from the 3.12
20:31 gbox boutcheee520: Then just do a find ... | parallel cp {} /new/volume
20:31 gbox boutcheee520: You'll have to learn more about parallel, it's very useful with gluster in general
20:34 boutcheee520 another thing to add to the list ha
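A rough illustration of the parallel-copy idea; the paths and job count are assumptions, and GNU parallel substitutes each input line for {}:

    # one rsync per top-level directory, 7 jobs at a time
    ls /mnt/oldvol | parallel -j 7 rsync -aX /mnt/oldvol/{}/ /mnt/newvol/{}/
    # or, closer to the find | parallel cp form mentioned above
    find /mnt/oldvol -mindepth 1 -maxdepth 1 | parallel -j 7 cp -a {} /mnt/newvol/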
20:41 vbellur joined #gluster
20:53 russoisraeli joined #gluster
20:54 russoisraeli Hello folks. I have a replicated volume of 3 replicas. I would like to change it to a stripe of 2 replicas (4 peers). How can I change this?
21:07 JoeJulian gbox: No, but I generally end up with SLAs that require 6 nines, which you get (even with AWS's SLA) by using 3 full replicas.
21:08 JoeJulian boutcheee520: replicas give HA automatically.
21:08 russoisraeli sorry... 3 replicas -> distributed replicated
21:08 russoisraeli (of 4 bricks)
21:09 JoeJulian russoisraeli: I have a blog post about doing something like that... one sec while I dig through my blog...
21:09 russoisraeli JoeJulian - many thanks!
21:09 JoeJulian https://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/
21:09 glusterbot Title: How to expand GlusterFS replicated clusters by one server (at joejulian.name)
21:10 JoeJulian At Gluster Summit there was talk about how, in the future, they've figured out a way to do that much less aggressively and in an automated way.
21:10 JoeJulian It will be really nice to see that.
21:11 vbellur joined #gluster
21:12 MrAbaddon joined #gluster
21:16 rwheeler joined #gluster
21:18 russoisraeli JoeJulian - I am sorry - I am a bit unclear - in the example, there was a replica of 2. What was the result? Replica distributed with a single box?
21:19 russoisraeli and, does replace-brick create .... a copy?
21:20 JoeJulian If you had a replica 2 on two servers (one brick each) and added two more servers without changing the replica count, you would end up with a distribute-replica volume. The more replica sets you add, the more subvolumes distribute has.
21:20 russoisraeli In my case it sounds a bit different - I have 3 replicas. One of them needs to join another box to create another replica, and then distribute the stuff over both
21:20 russoisraeli Right.... but in the example, only one server is added
21:21 JoeJulian Right. Because people keep asking how to do that, I came up with a strategy.
21:21 JoeJulian So...
21:22 JoeJulian You have 3 servers in a replica 3. You want to add 1 box and keep your replica count at 3?
21:22 russoisraeli no, I would like to have it just like in the example in the docs
21:22 russoisraeli 2 replicas, and the data distributed over them
21:22 JoeJulian Oh, so 2x2. Well that's easy.
21:23 JoeJulian gluster volume remove-brick replica 3 <one of the bricks>
21:23 JoeJulian gah
21:23 JoeJulian typo
21:23 JoeJulian gluster volume remove-brick replica 2 <one of the bricks>
21:23 JoeJulian That will leave you with a replica 2 volume on 2 servers.
21:23 JoeJulian format the brick you removed.
21:24 JoeJulian gluster volume add-brick <brick you formatted> <new brick>
21:24 JoeJulian then rebalance
21:24 russoisraeli ah, so simply add 2 bricks, once one is removed
21:24 JoeJulian right
21:25 russoisraeli format - just mkfs.xfs on the partition? or something less ?
21:25 JoeJulian yeah, that's what I would do. You could rm -rf or something similar, but I'm lazy and like to do things more quickly. ;)
21:25 russoisraeli many thanks. I really appreciate it
21:26 JoeJulian You're welcome.
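Spelled out with placeholders (myvol and the brick paths are made up); the remove-brick that lowers the replica count takes the volume name and force:

    gluster volume remove-brick myvol replica 2 server3:/bricks/b1/brick force
    # wipe the removed brick, e.g. re-run mkfs.xfs on it, then re-add it
    # together with the brick on the new fourth server
    gluster volume add-brick myvol server3:/bricks/b1/brick server4:/bricks/b1/brick
    gluster volume rebalance myvol start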
21:29 wushudoin joined #gluster
21:32 russoisraeli this was painless :)
21:33 JoeJulian +1
21:40 jbrooks joined #gluster
22:14 map1541 joined #gluster
22:25 msvbhat joined #gluster
22:43 MrAbaddon joined #gluster
22:54 protoporpoise joined #gluster
22:55 gbox Tiering question: If one tier goes down does the data on the other tier remain accessible? I would guess yes?
22:56 JoeJulian gbox: I haven't looked at how that graph is built. My guess would be yes, too.
22:57 protoporpoise Howdy all, Just found an odd issue with a client connecting to 3.12, on the clients, when they ls -la in a gluster mounted directory it returns total 0 (no files or directories), but if you cd into a directory you know exists in there and ls you can see the files in there - it's as if somehow the client isn't realising that files have been created in a way that tools such as ls then see them
22:57 protoporpoise if you create a new file (touch) and ls, it does show up
22:57 JoeJulian gbox: You can look at the graph built in /var/lib/glusterd/vols/$volname to see how it's built.
22:57 gbox JoeJulian: Thx!  I know graph theory from college.  How do you review the gluster graph?
22:58 JoeJulian Each translator links to one or more translators below it. It's in the .vol files.
22:59 protoporpoise I was wondering with my issue, if perhaps it could be something like gluster not flushing directory metadata to disk or something odd like that, perhaps to do with the write-behind-window-size or similar?
23:01 omie888777 joined #gluster
23:05 JoeJulian I've seen that once before, protoporpoise, with a 3.11 version. I upgraded my servers to 3.12 and the problem went away.
23:05 JoeJulian So I never actually diagnosed it.
23:21 protoporpoise very interesting...
23:21 protoporpoise I wonder if the client op-version is wrong somehow
23:23 JoeJulian I was wondering that, too. I suppose if I wanted to diagnose it further I would have started by using wireshark to see if the rpc call asked for the directory and what the response was. If the response contained the directory entries then I would look at the client-side translators.
23:23 JoeJulian I did try turning off all the performance translators to no effect.
23:23 protoporpoise hmm, looks like the servers and the clients are all set to 31200
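For what it's worth, a quick way to see those numbers from the server side (a sketch; both options should be available on 3.12 servers):

    gluster volume get all cluster.op-version       # the cluster's current operating version
    gluster volume get all cluster.max-op-version   # highest op-version this install supports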
23:24 protoporpoise 0-management: RPC_CLNT_PING notify failed
23:24 protoporpoise I see that in the logs on the servers a lot, but there is a LOT of noise in the gluster logs so I'm never quite sure what to ignore
23:24 JoeJulian I understand.
23:24 protoporpoise I've also recently seen 0-management: Error setting index on brick status rsp dict
23:25 protoporpoise and: 0-dict: !this || !value for key=index [Invalid argument]
23:25 JoeJulian Usually I only worry about ' E ' (errors) then use the rest of the noise as more of a backtrace.
23:26 JoeJulian It's almost 4:30 and I said I was going to have some work done by 5, or I'd dig into the source to see if I could tell you what that means.
23:26 protoporpoise OK so the only error-level message I see is: [glusterd-syncop.c:1014:gd_syncop_mgmt_brick_op] 0-management: Error setting index on brick status rsp dict" repeated 2 times...
23:26 protoporpoise oh ok no problems, appreciate you replying at all :)
23:26 JoeJulian :D
23:27 JoeJulian I'm already a week late on this PR since I was away at Prague.
23:27 vbellur joined #gluster
23:27 protoporpoise it's a hard life ;)
23:27 protoporpoise thanks for your time though, good luck :)
23:41 plarsen joined #gluster
23:43 protoporpoise I wonder if this problem could be caused by #1490493, which was fixed in 3.12.2, which I'm just waiting on RPMs for - http://docs.gluster.org/en/latest/release-notes/3.12.2/
23:43 glusterbot Title: 3.12.2 - Gluster Docs (at docs.gluster.org)
23:43 protoporpoise we are mounting subdirectories so it's possible
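For context, the subdirectory mount form in question looks roughly like this; server, volume and subdirectory names are placeholders:

    # native fuse subdirectory mount, a 3.12 feature
    mount -t glusterfs server1:/gv0/subdir1 /mnt/subdir1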
23:54 bwerthmann joined #gluster
