
IRC log for #gluster, 2012-12-26


All times shown according to UTC.

Time Nick Message
00:12 robo joined #gluster
00:37 yinyin joined #gluster
00:56 robo joined #gluster
01:11 a2 joined #gluster
02:00 yinyin joined #gluster
02:40 yinyin joined #gluster
03:33 shylesh joined #gluster
03:38 bala joined #gluster
03:47 maxiepax joined #gluster
03:49 yinyin joined #gluster
04:19 sunus joined #gluster
04:28 ramkrsna joined #gluster
04:28 ramkrsna joined #gluster
04:28 sunus joined #gluster
04:29 jermudgeon joined #gluster
04:30 raven-np joined #gluster
04:31 vpshastry joined #gluster
04:33 greylurk joined #gluster
04:50 yinyin joined #gluster
04:55 sgowda joined #gluster
04:57 bulde joined #gluster
05:16 yinyin joined #gluster
05:37 rastar joined #gluster
05:41 Humble joined #gluster
05:48 mohankumar joined #gluster
05:49 yinyin joined #gluster
05:56 raven-np joined #gluster
06:01 raghu joined #gluster
06:46 raven-np1 joined #gluster
06:47 raghu joined #gluster
06:48 raven-np joined #gluster
06:49 yinyin joined #gluster
06:50 raven-np2 joined #gluster
06:52 raven-np1 joined #gluster
06:54 raven-np joined #gluster
06:59 raven-np1 joined #gluster
07:01 raven-np joined #gluster
07:03 raven-np joined #gluster
07:13 raven-np joined #gluster
07:30 glusterbot New news from resolvedglusterbugs: [Bug 840737] After the upgrade from glusterfs 3.2 to 3.3, the content of stipe and distributed-stripe is not available in mounted directory <http://goo.gl/k1GUb>
07:34 raven-np1 joined #gluster
07:44 guigui3 joined #gluster
07:45 yinyin joined #gluster
07:47 ekuric joined #gluster
08:32 ramkrsna joined #gluster
08:32 ramkrsna joined #gluster
08:57 vimal joined #gluster
09:01 kevein_ joined #gluster
09:02 clag_ joined #gluster
09:42 Humble joined #gluster
09:46 ramkrsna joined #gluster
09:51 vpshastry joined #gluster
10:00 shireesh joined #gluster
10:27 nullck joined #gluster
10:30 glusterbot New news from resolvedglusterbugs: [Bug 823242] Add-brick to ditributed-replicate volume makes directories invisible for sometime <http://goo.gl/GG7BX> || [Bug 823404] I/O fails on the mount point while remove brick migrates data and committed <http://goo.gl/exEqy> || [Bug 787258] RDMA-connected clients drop mount with "transport endpoint not connected" <http://goo.gl/QcDbm>
10:35 hurdman left #gluster
10:50 yinyin joined #gluster
11:05 vpshastry joined #gluster
11:27 inevity joined #gluster
11:37 rastar joined #gluster
11:41 joeto joined #gluster
11:42 inodb joined #gluster
11:50 yinyin joined #gluster
12:07 bulde joined #gluster
12:20 yinyin joined #gluster
12:21 vpshastry joined #gluster
12:21 raven-np joined #gluster
12:42 rastar joined #gluster
13:08 yinyin joined #gluster
13:24 raven-np joined #gluster
13:33 chirino joined #gluster
13:52 guigui1 joined #gluster
13:53 zhuyb joined #gluster
13:53 zhuyb hi
13:53 glusterbot zhuyb: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
13:58 zhuyb [2012-12-25 10:34:54.102459] W [afr-common.c:1121:afr_conflicting_iattrs] 24-jss-r2-replicate-17: /6002/music/6/79e4c43fbff541d0bef6043e812dfa83.mp3: gfid differs on subvolume 1 (a68a619e-d769-4ed5-a3d2-f27e28ad7d2b, 04c064d6-08b2-40e6-9013-dd639f29f456)
13:58 zhuyb [2012-12-25 10:34:54.102482] E [afr-self-heal-common.c:1333:afr_sh_common_lookup_cbk] 24-jss-r2-replicate-17: Conflicting entries for /6002/music/6/79e4c43fbff541d0bef6043e812dfa83.mp3
13:59 zhuyb version is 3.2.5 , does somebody know how to deal with this ?
14:01 maxiepax joined #gluster
14:22 stopbit joined #gluster
14:53 mohankumar joined #gluster
15:00 Lejmr joined #gluster
15:01 cicero zhuyb: i'm not sure, but maybe copy the files directly off the bricks and then compare to see which one is correct? and then copy that back to the gluster mountpoint
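
A minimal sketch of that comparison, run directly on each replica server (brick paths under /export/brick1 are hypothetical; trusted.gfid is the attribute the log lines above say differs):

    # compare the actual file contents on each replica's brick
    md5sum /export/brick1/6002/music/6/79e4c43fbff541d0bef6043e812dfa83.mp3
    # inspect the gfid stored on this brick's copy (getfattr comes from the attr package)
    getfattr -n trusted.gfid -e hex /export/brick1/6002/music/6/79e4c43fbff541d0bef6043e812dfa83.mp3
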
15:02 zhuyb is there a better way ?
15:02 cicero probably :P
15:03 cicero did you upgrade recently?
15:03 zhuyb I did a rebalance operation on my cluster last week
15:03 raven-np joined #gluster
15:04 cicero interesting: http://community.gluster.org/a/alert-glusterfs-release-for-gfid-mismatch/
15:04 glusterbot <http://goo.gl/uoyTN> (at community.gluster.org)
15:04 cicero but that was pre-3.2.3
15:04 zhuyb before the rebalance op finished, one node rebooted for some reason
15:05 Lejmr joined #gluster
15:06 zhuyb hi, cicero, is there a document showing the process of migrating data?
15:07 zhuyb I want to reproduce the case in a test environment
15:07 zhuyb but failed.
15:09 zhuyb I want to just remove the file that causes the problem and wait for gluster to self-heal it.
15:09 cicero :\ not sure
15:10 zhuyb is there a superman who knows for sure how to deal with this?
15:10 cicero T_T
15:10 cicero probably but i imagine they're on holiday or something
15:10 zhuyb I've been trying to fix this for 48 hrs. tired.
15:11 cicero that article i linked to is interesting though
15:11 cicero because it generates the gfid separately
15:12 cicero dunno, that might be a way to sync it
15:12 zhuyb thanks.
15:20 cicero glfh
15:20 cicero glhf
15:22 wushudoin joined #gluster
15:22 isomorphic joined #gluster
15:23 zhuyb PREVENTION: To *prevent* the issue, please install GlusterFS 3.2.3 If you're using 3.1.x, upgrade to 3.1.7
15:24 zhuyb my version is 3.2.5 , >= 3.2.3
15:24 zhuyb > 3.2.3
15:25 semiosis :O
15:41 atrius_away joined #gluster
15:48 guigui3 joined #gluster
16:14 raven-np1 joined #gluster
16:30 __Bryan__ joined #gluster
16:34 raven-np joined #gluster
16:56 edward1 joined #gluster
16:59 zhuyb joined #gluster
17:01 Humble joined #gluster
17:03 ekuric1 joined #gluster
17:14 duerF joined #gluster
17:25 peterlin joined #gluster
17:25 Mo__ joined #gluster
17:25 Mo___ joined #gluster
17:30 peterlin I'm wondering about nfs.mem-factor. It says that increasing the value will make nfs faster, and to contact the mailing list first. I have a situation where the nfs server seems to be a little overloaded, especially after a brick outage (and subsequent healing needs to be done). Could changing the setting help me, and if so, does anybody understand what exactly it does?
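
A hedged aside on how such an option would be changed: NFS translator options are set per volume with "gluster volume set" (the volume name and value below are only illustrative, not a recommendation):

    # illustrative only; consult the mailing list before raising nfs.mem-factor
    gluster volume set myvol nfs.mem-factor 20
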
18:51 ekuric1 left #gluster
19:01 eightyeight hmm. 'gluster peer status' shows my other node is connected, but 'gluster volume create ....' fails, saying it's not connected. what do i need to do to create the new volume?
19:18 nueces joined #gluster
19:26 JoeJulian Hello all. Hope everyone had a merry Christmas.
19:26 eightyeight well, i live migrated all vms to one node, unmounted the gluster client mount, stopped glusterd, removed /etc/glusterd/peers/<abcdef>/, restarted gluster, re-peered, re-mounted the client mount, and _still_ cannot create the new volume
19:27 eightyeight the peer is connected, but creating the volume says it's not
19:27 * eightyeight is stumped
19:27 JoeJulian restart the /other/ glusterd?
19:29 eightyeight yeah. re-did the steps above for the other node
19:30 eightyeight on both nodes: 1. migrate, 2. umount, 3. stop glusterd, 4. rm peer uuid file, 5. start glusterd, 6. peer probe, 7. mount, 8. create volume
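
A rough shell sketch of that sequence, with placeholder names (the service name, volume names, brick paths, and peer UUID are hypothetical; the peers path is the one described above):

    umount /mnt/gluster                          # 2. unmount the client mount
    service glusterd stop                        # 3. (the service may be named glusterfs-server on Debian)
    rm -rf /etc/glusterd/peers/<peer-uuid>       # 4. drop the stale peer entry
    service glusterd start                       # 5.
    gluster peer probe node2.example.com         # 6.
    mount -t glusterfs localhost:/oldvol /mnt/gluster   # 7.
    gluster volume create newvol replica 2 node1.example.com:/export/newbrick node2.example.com:/export/newbrick   # 8.
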
19:31 eightyeight looking at logs now
19:31 JoeJulian What's the exact error message so I can look in the code and see what the possibilities are? Also, fpaste the logs.
19:32 xinkeT joined #gluster
19:33 eightyeight ok. mind if we go to pm?
19:36 JoeJulian Kind-of... I'd prefer that everyone is able to learn from this debugging, but for log links I don't mind.
19:37 badone joined #gluster
19:37 eightyeight ok. in that case, i'm masking the fqdn of the nodes
19:37 JoeJulian sure
19:37 eightyeight here's a pastebin of the output of both nodes: http://ae7.st/p/31p
19:37 glusterbot Title: Pastebin on ae7.st » 31p (at ae7.st)
19:38 * eightyeight prepares a pastebin of the logs
19:39 eightyeight i don't know how much of the logs you want. assuming /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
19:40 eightyeight i have rdma errors in there, that i could probably weed out
19:40 JoeJulian That's my assumption as well.
19:40 JoeJulian Hrm... your peer state is wrong. It should be State: Peer in Cluster (Connected)
19:41 eightyeight there is a lot of:
19:41 eightyeight [2012-12-26 12:39:37.452220] E [socket.c:2080:socket_connect] 0-management: connection attempt failed (Connection refused)
19:41 eightyeight this is v 3.2.7 from debian testing
19:42 SunCrushr joined #gluster
19:42 SunCrushr Hi everybody!
19:42 SunCrushr I've got a quick question for you.
19:44 SunCrushr Say I have two servers nodes setup with glusterfs, each providing one brick, and setup for replication in order to give high availability.
19:45 eightyeight JoeJulian: here's the necessary logs, on both servers, weeding out some of the (i think) unnecessary logs: http://ae7.st/p/767
19:45 glusterbot Title: Pastebin on ae7.st » 767 (at ae7.st)
19:45 SunCrushr Connecting to these servers are some nodes that will be running VMs and storing the VMs on the replicated servers.  If these VM nodes use the gluster client, how is the replication done?  Does the node do one write to one server, and then the servers replicate to each other, or does the client have to do a write to both servers?
19:46 JoeJulian brb in 5...
19:49 SunCrushr I'm trying to make up my mind on how to setup the topology here to squeeze the best performance out of the storage network we possibly can.  I'll be using RDMA on an infiniband network.
19:49 elyograg SunCrushr: From everything I've heard here, the client will write to both servers.  If one is down during the write, self-heal will fix it.  self-heal should be automatic with 3.3 and later.
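
For context, a minimal two-server replicated volume looks roughly like this (hostnames, volume name, and brick paths are hypothetical); with the native client, each write goes to both replicas:

    gluster volume create vmstore replica 2 server1:/export/brick1 server2:/export/brick1
    gluster volume start vmstore
    # native (FUSE) mount on a client; the client itself writes to both servers
    mount -t glusterfs server1:/vmstore /mnt/vmstore
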
19:49 SunCrushr Each of the storage nodes is going to have two dual port infiniband cards, and each client (virtualization node) will have two single port cards.  There will be two switches for redundancy as well.
19:50 SunCrushr I was originally going to configure the servers with two of the ports going to the infiniband switch, and two of them going directly to the other storage node.
19:50 * eightyeight gets lunch
19:51 SunCrushr I was then going to have all replication take place via the direct connection.  Sounds like if I use native gluster on the clients, that won't be the case.
19:52 SunCrushr Also, sounds like using native gluster on the clients will double the bandwidth for each write.
19:53 SunCrushr So, I'm left wondering if I should setup glusterfs over RDMA on the link between the nodes, and then just use NFS over RDMA with some other mechanism for HA over the main storage network client to server.
19:54 SunCrushr If it helps to know, there will never be more storage server nodes in this location, as each server will be upgradable to up to 2 PB of storage, so I'll not need to add another node.
20:08 JoeJulian eightyeight: try running glusterd with log-level=DEBUG I see some debug messages that look like they might shed some light on why you're in GD_FRIEND_EVENT_LOCAL_ACC instead of GD_FRIEND_EVENT_CONNECTED.
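
One way to do that, sketched with hedging (exact flags can differ between versions and packagings):

    service glusterd stop
    glusterd --log-level=DEBUG     # or "glusterd --debug" to stay in the foreground and log to stderr
    tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
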
20:10 JoeJulian SunCrushr: Sounds to me like if maximum bandwidth is your goal, with the configuration you describe moving the dual ports to the clients and the single ports to the servers will maximize the use of your hardware using the native mount.
20:11 eightyeight i redid the steps above, this time using the hostname up to the first '.' in the fqdn, and i have "Peer in Cluster (Connected)"
20:11 eightyeight both the hostname and fqdn are in dns, and defined in /etc/hosts, with the proper search in /etc/resolv.conf
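
For illustration, /etc/hosts entries carrying both the FQDN and the short name look like this (addresses and names are hypothetical):

    192.0.2.11   node1.example.com   node1
    192.0.2.12   node2.example.com   node2
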
20:11 eightyeight *shrug*
20:11 Guest25446 joined #gluster
20:13 eightyeight well, on one node at least. the other node, not so lucky
20:13 SunCrushr Once I add a lot of clients, won't that configuration start to create a bottleneck at the servers?
20:13 eightyeight there we go
20:14 SunCrushr In any other many-to-few instance, I'd think you'd want more ports on the servers.
20:15 eightyeight ... and i'm good
20:17 passie joined #gluster
20:17 SunCrushr What do you think of the option of using NFS with UCARP for client access with HA, and just doing the replication via a second direct network link between the two fileservers?
20:19 SunCrushr This would make them act as one, without creating the need for double the write IOs from each client.
20:19 JoeJulian SunCrushr: I guess my opinion would depend on the end goal.
20:21 SunCrushr End goal is to have these two fileservers act as one (fully replicated) for the storage of VMs for multiple Proxmox VE based hypervisor servers, so most of the storage is VM images.
20:21 SunCrushr A secondary but not necessary goal is to load balance the read IOs between the fileservers.
20:30 passie left #gluster
20:30 JoeJulian Just read up on Proxmox which looks, essentially, like a rebranded RHEL. Since you're using qemu-kvm, you should bypass the clients altogether and use the library interface with GlusterFS 3.4.
20:31 JoeJulian As far as bandwidth is concerned, if it was me, I'd throw hardware at that problem as it becomes saturated.
20:32 JoeJulian (well, actually as the hardware saturation average exceeds the 80th percentile)
20:34 SunCrushr Actually, PVE is built on Debian, not RHEL.
20:34 SunCrushr it basically makes for a nice replacement for things like ESXi.
20:35 SunCrushr Can you give me a link to documentation on the library interface?
20:36 SunCrushr Also, wouldn't I have to wait for 3.4 to release?
20:36 JoeJulian http://www.gluster.org/2012/11/integration-with-kvmqemu/
20:36 glusterbot <http://goo.gl/IhqoH> (at www.gluster.org)
20:36 JoeJulian For that feature, yes.
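
The article above describes QEMU's gluster:// block driver (QEMU 1.3+ built with GlusterFS 3.4 support). A hedged sketch with hypothetical server, volume, and image names:

    # create an image directly on the volume via libgfapi
    qemu-img create -f qcow2 gluster://server1/vmstore/disk0.qcow2 20G
    # attach it to a VM (add the usual machine, memory, and network options)
    qemu-system-x86_64 -drive file=gluster://server1/vmstore/disk0.qcow2,if=virtio
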
20:39 SunCrushr Thanks for the info.
20:40 SunCrushr As far as bandwidth is concerned, I'm not too worried.  This will be an Infiniband QDR network to start with, so it should perform well right off the bat, no matter which topology and client access protocol I go with.  I'm just trying to think ahead.  Thanks for all the info.
20:41 JoeJulian You're welcome
20:49 kaos01 joined #gluster
20:55 mohankumar joined #gluster
20:55 manik joined #gluster
21:01 SunCrushr Sorry, one more question.  After considering what you've said, I'm leaning towards using the native gluster client on my client machines.  I know that not too long ago, one of the issues with using Gluster for HA was that large files (like VM images) would have to be locked in the event of healing, rendering the VMs offline during that process.  Is that no longer an issue?
21:02 JoeJulian No longer an issue since 3.3.0.
21:04 SunCrushr Thanks!
21:04 kaos01 hi, maybe stupid question but say two nodes get a file at teh same time, what happens ?
21:04 semiosis ,,(glossary)
21:04 glusterbot A "server" hosts "bricks" (ie. server1:/foo) which belong to a "volume"  which is accessed from a "client"  . The "master" geosynchronizes a "volume" to a "slave" (ie. remote1:/data/foo).
21:05 semiosis kaos01: could you clarify your question please?  node is ambiguous...
21:06 kaos01 i probably dont know how glusterfs works. is there a server node which "syncs" files to client nodes?
21:06 kaos01 or all nodes sync amongst each other
21:07 semiosis the files live in bricks on the servers, clients access them over the network
21:07 JoeJulian A node is an endpoint, in this vernacular that's typically any device that's attached to a network.
21:07 semiosis if you're using a replicated volume then the replica bricks are kept in sync
21:20 SunCrushr Another question about replication.  Say I've got two glusterfs servers which each have one brick, replicated, and one client machine.  If one of the servers goes offline, and then comes back online later, obviously files need to be healed.  What controls this healing process, the client, or the servers?  Also, will this healing process generate a lot of extra IOs on the client?
21:25 semiosis https://github.com/blog/1364-downtime-last-saturday
21:25 glusterbot <http://goo.gl/v5Pfx> (at github.com)
21:25 semiosis sounds like github could benefit from switching to glusterfs using replication & quorum instead of their ad-hoc & complicated drbd solution
21:25 semiosis johnmark, go!
21:25 semiosis :)
21:27 semiosis SunCrushr: there's a self-heal daemon running on the servers which should manage the healing process... i'm not too familiar with it though, in the pre-3.3.0 days healing would be done by the client
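
The 3.3+ heal status and trigger commands, for reference (volume name is hypothetical):

    gluster volume heal vmstore info    # list entries still needing heal
    gluster volume heal vmstore         # heal entries with pending changes
    gluster volume heal vmstore full    # crawl the whole volume and heal everything
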
21:29 SunCrushr But you're relatively certain that the client doesn't have to do the healing anymore?  You can understand my concern here, as the client needs to provide CPU and RAM resources to VMs in my case.
21:32 SunCrushr Also, if I do end up using NFS for client access, how does Gluster handing the replication.  Obviously the NFS writes will go to one server, so does Gluster just automatically replicate each write over the network to the other gluster server?
21:32 semiosis you really should try every failure scenario you can think of and make sure things go how you expect before going to prod.
21:32 SunCrushr (handing should be handle)
21:34 semiosis the gluster nfs server is kinda like a native client mount re-exported over nfs, except that fuse is not involved, and the nfs server is built in to glusterfs
21:34 semiosis but conceptually its very much like using a native client on the nfs server
21:34 semiosis so while the client transacts with only one server, that server routes to all bricks
21:34 semiosis as needed
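
For reference, mounting the built-in Gluster NFS server from a client uses NFSv3 over TCP (hostname and volume name are hypothetical):

    mount -t nfs -o vers=3,mountproto=tcp server1:/vmstore /mnt/vmstore
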
21:35 SunCrushr OK, and that routing could be on a separate network if so desired?
21:36 SunCrushr So I could have separate interfaces on my gluster servers (2 servers) that are directly connected for replication?
21:36 semiosis well yes i suppose you could
21:37 SunCrushr And then I could use round robin DNS to spread the NFS access between the two gluster servers?
21:38 semiosis again, i suppose you could
21:38 SunCrushr I'm trying to make this so that if the switches go down replication still functions between the two servers.
21:38 semiosis but dont take that to mean i recommend it :)
21:39 SunCrushr Of course, the tradeoff is that every bit of IO done over NFS in turn causes replication IO to occur on the other network interfaces of the servers, so the servers have to handle more.
21:50 semiosis SunCrushr: imho there's only two cases which justify using an NFS client... 1) absolutely can't install the fuse client (non-linux, administrative policy, etc...), and 2) wan clients
21:50 semiosis otherwise imho fuse client is the way to go
21:50 semiosis more reliable, more scalable
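
A sketch of the native mount with a fallback volfile server, which helps when the first server is down at mount time (the option name varies slightly between versions; hostnames are hypothetical):

    mount -t glusterfs -o backupvolfile-server=server2 server1:/vmstore /mnt/vmstore
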
21:55 SunCrushr And faster with large files from what I've heard.
22:25 JoeJulian It's "faster" will all files, just doesn't cache stats.
22:31 carlosmp joined #gluster
22:33 carlosmp Hi - I'm trying to find out if gluster is a viable storage backend for a HyperV solution?  HyperV 2012 supports NFS for vhd, but only if SMBv3, or so I've been told...
22:36 JoeJulian NFS != SMB so one of us is confused.
22:37 carlosmp Yes, when they mentioned that I was a bit confused myself...
22:37 JoeJulian Maybe they're the ones that are confused.
22:37 carlosmp This is what they told me: Windows Server 2012 Hyper-V checks that the remote end is running SMB v3 for VHDs located on a remote share. This means that you'll be able to run VHDs from a NetApp (and others) once they provide SMB v3 capability.
22:40 JoeJulian Hmm, though old, http://blog.thestoragearchitect.com/2011/06/02/why-does-microsoft-hyper-v-not-support-nfs/ does seem to come to the same conclusion.
22:40 glusterbot <http://goo.gl/Ce6QZ> (at blog.thestoragearchitect.com)
22:42 JoeJulian Looks like if you want to use GlusterFS you'll have to mount the volume and re-share it with samba 3
22:42 JoeJulian Sorry it took so long to type that. I had to keep deleting my anti-microsoft slant.
22:42 carlosmp Wondering how remount with samba will affect performance
22:43 JoeJulian Compared to not being able to make that work at all any other way, it should perform better.
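
A minimal smb.conf sketch for re-exporting a locally mounted Gluster volume (share name and path are hypothetical; negotiating SMB v3 requires a Samba release new enough to support it, not the Samba 3.x series):

    [vmstore]
        path = /mnt/vmstore
        read only = no
        browseable = yes
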
22:44 kaos01 trying to create cluster volume but get: host not connected, gluster peer status says its connected
22:45 JoeJulian carlosmp: Normally, I would suggest you file a bug report (not the link glusterbot's about to give you) with the hypervisor vendor.
22:45 glusterbot http://goo.gl/UUuCq
22:45 JoeJulian ~pastestatus | kaos01
22:45 glusterbot kaos01: Please paste the output of "gluster peer status" to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
22:45 JoeJulian kaos01: ... from more than one server
22:46 JoeJulian @forget pastestatus
22:46 glusterbot JoeJulian: The operation succeeded.
22:46 JoeJulian @learn pastestatus as Please paste the output of "gluster peer status" from more than one server to http://fpaste.org or http://dpaste.org then paste the link that's generated here.
22:46 glusterbot JoeJulian: The operation succeeded.
22:47 carlosmp @JoeJulian - thanks for the info!  trying to see our options for building out our vm servers and gluster seems like the best option for us.  if that means moving to vmware then so be it...
22:47 JoeJulian +1
22:47 JoeJulian ... or kvm or xen
22:48 layer3switch joined #gluster
22:48 kaos01 JoeJulian:    http://fpaste.org/MGKZ/
22:48 glusterbot Title: Viewing Paste #262953 (at fpaste.org)
22:49 JoeJulian kaos01: What version is this? This same issue was happening to eightyeight earlier today. He worked around it by using short names instead of fqdn.
22:49 kaos01 ok i try that :)
22:50 kaos01 glusterfs-3.2.7-1.el6.x86_64
22:50 JoeJulian kaos01: ... and, of course, you didn't probe the first server by hostname so that's why it's showing an ip address.
22:50 JoeJulian @yum repo
22:50 glusterbot JoeJulian: kkeithley's fedorapeople.org yum repository has 32- and 64-bit glusterfs 3.3 packages for RHEL/Fedora/Centos distributions: http://goo.gl/EyoCw
22:50 kaos01 JoeJulian im pretty sure i used hostname :)
22:51 carlosmp @JoeJulian - yes been using proxmox for kvm/containers and working well.
22:51 JoeJulian You probed one server by hostname, but to set the hostname of that first server you probed with, you have to probe it from another server.
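
In other words, run a probe from a second server so the first server's entry picks up its hostname instead of its IP (hostname is hypothetical):

    # on server2
    gluster peer probe server1.example.com
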
22:52 kaos01 ok
22:52 raven-np joined #gluster
22:52 JoeJulian At least with the open-source solution, you can actually get bugs addressed if you're a smaller customer than, for instance, Boeing.
22:53 JoeJulian That's not going to happen with my neighbors over here at MS.
22:55 JoeJulian That's not just speculation either. I know several testers that have confirmed that over the years. It's only the customers with large purchasing power that get hotfixes into MS products.
23:12 kaos01 JoeJulian, firewall :(
