
IRC log for #gluster, 2012-10-30


All times shown according to UTC.

Time Nick Message
00:12 H__ joined #gluster
00:13 nodots1 left #gluster
00:40 crashmag joined #gluster
01:01 Fabiom joined #gluster
01:02 Fabiom distributed replication - Added two more servers, one brick on each. gluster volume add-brick [volume] [server3/brick] [server4/brick]. I did a volume rebalance. I see server3 and server4 filling up with data. I wanted to expand the volume by the extra bricks, eg 100G to 200G of usable space. Can someone explain to me what is going on? Did I do something wrong?
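[For reference, the expansion Fabiom describes is normally done with the commands below; the volume and brick names are examples, and on a replica-2 volume bricks must be added in multiples of two:

    # add a new replica pair to an existing replica-2 volume
    gluster volume add-brick myvol server3:/export/brick1 server4:/export/brick1
    # spread existing data onto the new bricks and watch progress
    gluster volume rebalance myvol start
    gluster volume rebalance myvol status
]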
01:17 johnmark z00dax: I used to, then moved out here over the summer
01:27 Daxxial_ joined #gluster
01:57 dmachi1 joined #gluster
02:12 seanh-ansca joined #gluster
02:15 jiffe1 did the office hours happen last week?
02:25 chacken2 joined #gluster
02:31 bharata joined #gluster
02:33 Nr18 joined #gluster
02:33 ika2810 joined #gluster
02:45 chacken joined #gluster
02:47 sgowda joined #gluster
03:30 shylesh joined #gluster
03:31 UnixDev joined #gluster
03:42 sripathi joined #gluster
03:44 saz joined #gluster
04:03 blendedbychris joined #gluster
04:03 blendedbychris joined #gluster
04:23 vpshastry joined #gluster
04:26 ngoswami joined #gluster
04:38 koodough joined #gluster
05:04 vikumar joined #gluster
05:05 hagarth joined #gluster
05:16 faizan joined #gluster
05:23 bharata_ joined #gluster
05:29 koodough joined #gluster
05:36 koodough1 joined #gluster
05:38 lng joined #gluster
05:39 lng Hi! I need to upgrade Gluster from 3.3.0 to 3.3.1 on Ubuntu, but 3.3.0 was installed from deb package. How can I do it?
05:39 lng but 3.3.1 is PPA
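[A rough sketch of the switch lng is asking about, assuming the 3.3.1 packages come from the Ubuntu PPA mentioned later in this log (the PPA and package names below are assumptions) and that brief downtime on this server is acceptable:

    # back up the gluster working directory first
    sudo cp -a /var/lib/glusterd /var/lib/glusterd.bak
    # stop gluster and remove the manually installed 3.3.0 deb (package name may differ)
    sudo service glusterfs-server stop
    sudo dpkg -r glusterfs
    # add the PPA and install 3.3.1 from it
    sudo add-apt-repository ppa:semiosis/ubuntu-glusterfs-3.3
    sudo apt-get update && sudo apt-get install glusterfs-server glusterfs-client
    sudo service glusterfs-server start
]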
05:50 shireesh joined #gluster
05:51 raghu joined #gluster
05:52 bala joined #gluster
05:54 sgowda joined #gluster
05:59 mdarade1 joined #gluster
06:09 Venkat joined #gluster
06:11 deepakcs joined #gluster
06:23 koaps_ left #gluster
06:30 ramkrsna joined #gluster
06:30 ramkrsna joined #gluster
06:40 rgustafs joined #gluster
06:50 sripathi joined #gluster
07:00 UnixDev joined #gluster
07:03 ngoswami joined #gluster
07:04 bala joined #gluster
07:07 lkoranda joined #gluster
07:10 vimal joined #gluster
07:21 puebele joined #gluster
07:22 shireesh joined #gluster
07:24 Technicool joined #gluster
07:39 puebele joined #gluster
07:52 ika2810 joined #gluster
07:57 tjikkun_work joined #gluster
08:02 ctria joined #gluster
08:03 samkottler joined #gluster
08:11 ngoswami joined #gluster
08:36 ctria joined #gluster
08:43 Nr18 joined #gluster
08:43 morse joined #gluster
08:48 manik joined #gluster
08:56 dobber joined #gluster
08:58 gbrand_ joined #gluster
08:59 Venkat joined #gluster
09:04 TheHaven joined #gluster
09:13 clag_ joined #gluster
09:19 ctria joined #gluster
09:32 21WAAMJYW joined #gluster
09:36 mdarade1 joined #gluster
09:40 DaveS_ joined #gluster
09:45 kevein joined #gluster
09:47 hagarth joined #gluster
09:55 vpshastry joined #gluster
09:57 pkoro joined #gluster
09:58 tryggvil joined #gluster
10:11 Tarok joined #gluster
10:13 rgustafs joined #gluster
10:32 ika2810 joined #gluster
11:04 jays joined #gluster
11:25 kkeithley joined #gluster
11:32 mohankumar joined #gluster
11:33 vpshastry1 joined #gluster
11:36 ika2810 left #gluster
11:37 samu60 joined #gluster
11:38 samu60 is there anyone around?
11:38 mdarade1 joined #gluster
11:38 mdarade1 joined #gluster
11:39 samu60 I've got a question about a gluster environment using IP over Infiniband
11:40 samu60 since transport rdma is not usable in 3.3.1
11:40 samu60 the problem is that the transfer rate is quite acceptable until the flush-8:16 process blocks the system
11:40 samu60 I guess that's copying dirty pages to disk
11:41 samu60 at this moment the transfer drops to ridiculous figures
11:41 samu60 and the io rises up to 99% for all processes, both glusterfs and flush
11:41 samu60 "freezing" the system
11:41 samu60 is there any parameter controlling this behaviour?
11:43 VeggieMeat joined #gluster
11:49 VeggieMeat joined #gluster
11:54 bala joined #gluster
11:56 Tarok joined #gluster
12:02 mohankumar joined #gluster
12:02 vpshastry1 joined #gluster
12:11 ndevos samu60: flush-8:16 comes from the block-layer and affects device 8:16, normally /dev/sdb
12:11 samu60 it is indeed /dev/sdb
12:12 ndevos samu60: what filesystem are you using?
12:12 samu60 XFS
12:12 Triade joined #gluster
12:12 samu60 with ext4, both the flush and jbd2 processes rose to 99%
12:12 ndevos any specific mount options?
12:12 samu60 not
12:12 samu60 defaults
12:12 samu60 just modified with
12:13 samu60 blockdev --setra 4096 /dev/sdb
12:13 ndevos and what kind of volumes do you have, just replicate/distribute, or also stripe?
12:13 samu60 and reduced dirty related options in /proc
12:13 samu60 Type: Distribute
12:13 samu60 just one node
12:14 samu60 system is on Centos 5.3
12:14 samu60 Centos 6.3, sorry
12:14 ndevos samu60: what kind of storage is /dev/sdb? i've not heard of flush blocking completely on normal hardware
12:15 ndevos I guess it can happen on iscsi and the like, if network is lagging....
12:15 samu60 HP SATA 7.2K
12:16 samu60 I'm using Infiniband with TCP transport (IPoIB)
12:16 ndevos samu60: is there a lot being written at the same time?
12:16 samu60 not really
12:16 samu60 only the gluster process
12:16 samu60 using iotop only shows gluster and flush processes
12:16 ndevos well, sure, but are there any clients writing to the volume?
12:16 samu60 and write values drops to 0%
12:16 samu60 everything is stopped
12:16 samu60 only one client
12:17 samu60 due to hardware and budget limitations we had to setup 2 clients
12:17 samu60 and one of them is also the gluster server
12:18 samu60 so I don't know whether the client and server gluster on the same machine is causing some weird issues
12:18 ndevos samu60: and both clients write to a mounted volume, not to the brick directly?
12:18 samu60 right
12:18 samu60 to the fuse client mounted volume
12:18 ndevos okay, should be fine
12:19 samu60 I expect some low figures due to this limitation but it's ok since it's currently a test scenario
12:19 ndevos so the question is, why flush would block on writing to the disk...
12:20 samu60 my question is whether Infiniband or some other parameter could cause this strange behaviour
12:20 * ndevos does not think so
12:20 samu60 up to approximately 1G of data being written, everything is ok
12:20 samu60 but after X G of data, flush starts "eating IO waiting"
12:20 samu60 and the system freezes
12:21 samu60 is it probably a disk (hardware) issue?
12:21 ndevos samu60: do you have any hp-monitoring tools installed? I think some of them trigger some smart checking on the disk too
12:21 samu60 more than the infiniband and shared client/server scenario?
12:21 samu60 I don't think so
12:22 samu60 I'll dig into hardware checking tools and come back if the disk is doing fine
12:22 ndevos samu60: and, maybe try mounting with 'sync', any changes should be flushed immediately that way, maybe it helps
12:23 samu60 as an extra thing...on the server there is a disk (sda) with ext4 and another disk (sdb) with XFS and doing local dd
12:23 samu60 also shows strange values
12:23 samu60 ok
12:23 ndevos it won't be a real fix, but might give some ideas on what to troubleshoot
12:23 samu60 i'll try mounting with sync
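[For reference, the tunables touched in this exchange, as a hedged sketch only (the device comes from the conversation; the mount point and values are illustrative, not recommendations):

    # the readahead tweak samu60 mentioned
    blockdev --setra 4096 /dev/sdb
    # typical "dirty page" settings one would reduce via /proc (illustrative values)
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=10
    # ndevos' suggestion: remount the XFS brick with sync so writes are flushed immediately
    mount -o remount,sync /dev/sdb /export/brick1    # brick path is an example
]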
12:24 samu60 on ext4 disk:
12:24 samu60 dd if=/dev/zero of=lavoy2asdf bs=1G count=1
12:24 samu60 (1.1 GB) copied, 1.61541 s, 665 MB/s
12:24 samu60 dd if=/dev/zero of=lav3 bs=5G count=1
12:24 samu60 99.99 % [jbd2/dm-0-8]
12:24 samu60 31.09 M/s  0.00 % 95.60 % dd if=/de~5
12:25 samu60 (2.1 GB) copied, 30.8726 s, 69.6 MB/s
12:25 samu60 so only 2G are copied and the transfer is 70MB/s
12:25 samu60 on XFS disk (sdb):
12:25 samu60 89.15 % [flush-8:
12:25 samu60 dd if=/dev/zero of=lavoy2asdf bs=1G count=1
12:26 samu60 (1.1 GB) copied, 8.01541 s, 134 MB/s
12:26 samu60 dd if=/dev/zero of=lavoy2asdf bs=5G count=1
12:26 samu60 31.23 M/s  0.00 % 95.13 % dd if=/de~5G co
12:27 samu60 (2.1 GB) copied, 40.8922 s, 52.5 MB/s
12:27 samu60 so it does not copy the full 5G, only 2
12:27 samu60 and the transfer drops a lot
12:27 samu60 so it's more a thing about the disk or the system (the same happens on both ext4 and xfs)
12:27 samu60 any hint?
12:28 * ndevos thinks that's awkward, and doesn't have an idea what that could be
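[One note on the dd runs above: on Linux a single read() returns at most a bit over 2 GB, so dd bs=5G count=1 only ever writes about 2.1 GB, which matches the "(2.1 GB) copied" lines. The throughput figures are also more meaningful with the page cache taken out of the timing; a sketch, with example file paths:

    # write 5 GB in 1M blocks and include the final flush in the timing
    dd if=/dev/zero of=/mnt/sdb/ddtest bs=1M count=5120 conv=fdatasync
    # or keep the huge block size but force dd to fill each block completely
    dd if=/dev/zero of=/mnt/sdb/ddtest bs=5G count=1 iflag=fullblock conv=fdatasync
]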
12:30 samu60 thanks a lot
12:30 samu60 we'll make some hardware tests
12:31 samu60 and see whether there's a problem on the disks, the RAID controller, or other parameter on Centos 6.3
12:31 samu60 thanks a lot!
12:33 Tarok joined #gluster
12:37 andreask joined #gluster
12:42 pkoro joined #gluster
12:42 balunasj joined #gluster
12:48 Venkat joined #gluster
13:01 kkeithley1 joined #gluster
13:01 tmirks joined #gluster
13:03 Tarok joined #gluster
13:06 bala joined #gluster
13:08 lkoranda joined #gluster
13:11 Triade joined #gluster
13:11 samu60 just in case someone is looking at my previous posts, it was the RAID controller
13:11 samu60 after moving the disks to a new server it performs ok
13:14 Tarok joined #gluster
13:14 orogor joined #gluster
13:14 orogor hi
13:14 glusterbot orogor: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
13:15 orogor left #gluster
13:15 orogor joined #gluster
13:16 orogor looking for some ideas for filesharing which would provide access to metadata even if the host is offline, would glusterfs do this?
13:17 nodots joined #gluster
13:17 ndevos orogor: glusterfs doesn't have separate metadata, the only bits it has are in the extended attributes of the files and dirs
13:18 orogor any idea how to do this ?
13:18 orogor via another project
13:19 Tarok joined #gluster
13:21 johnmark orogor: you could probably create some type of script that would fetch metadata and shove it in a DB
13:22 ndevos orogor: you could save your metadata on a glusterfs volume to make it high-available :)
13:22 johnmark and use the marker API to only deal with metadata that has changed
13:23 johnmark orogor: but yeah, I'm with ndevos. the only way this is ever useful is if all of your server nodes are down
13:23 orogor making a script , i thought about it , then also thought maybe somone already does this kind of stuff
13:23 johnmark orogor: this sort of thing will be much easier with the next release
13:24 ndevos orogor: what metadata are you concerned for?
13:24 johnmark which, if I'm not mistaken, will include an exposed marker API, so that it's easier to write scripts that utilize it
13:26 * ndevos wonders why someone cares about metadata being unavailable, if that means the main data itself is unavailable too
13:26 orogor any and all, basically: provide access to the data, but don't care when I get it, so at least provide access to the metadata and let the copy be done when the node is available again, tape inserted or drive fixed
13:27 ndevos right, that is definitely out of scope for glusterfs, but you can use glusterfs to have N copies of that metadata on different hosts
13:27 orogor metadata provides a list of data that might be available; it still allows filename and filesize checks as well as additional info like a checksum
13:28 ndevos or use a DB with a number of replicating slaves
13:28 ndevos orogor: sounds like CXFS from SGI, it has some HSM features
13:30 hagarth joined #gluster
13:31 edward2 joined #gluster
13:34 guigui3 joined #gluster
13:40 shireesh joined #gluster
13:40 bulde1 joined #gluster
13:43 orogor left #gluster
13:47 overclk joined #gluster
13:49 tryggvil_ joined #gluster
13:55 Tarok left #gluster
13:57 dmachi joined #gluster
14:02 manik joined #gluster
14:02 samkottler joined #gluster
14:06 tryggvil joined #gluster
14:07 chouchins joined #gluster
14:11 Triade joined #gluster
14:15 overclk joined #gluster
14:17 sunus joined #gluster
14:18 Melsom hi! we're serving iscsi-volumes from our two bricks, and we are experiencing the two bricks never being completely in sync.
14:18 Melsom The large iscsi-files are always 5-10MB out of sync from one another.
14:18 Melsom anyone having experiences with large files and iscsi with glusterfs?
14:18 sensei You're serving from the bricks?
14:19 Melsom we're using 3.3.1
14:19 nodots left #gluster
14:20 Melsom yes, both nodes are mounted on themselves, though the iscsi-service is only running on one of them, to avoid vmware creating any kind of corruption with multipathing.
14:20 Melsom http://pastebin.com/9vqhm3Nw
14:20 glusterbot Please use http://fpaste.org or http://dpaste.org . pb has too many ads. Say @paste in channel for info about paste utils.
14:21 jbrooks joined #gluster
14:21 Melsom here is our current configuration.
14:21 Melsom the two bricks are connected through a 3x1gbit bond.
14:22 Melsom http://fpaste.org/kHTP/
14:22 glusterbot Title: Viewing Paste #247795 (at fpaste.org)
14:24 Melsom gluster-client is having high loads on both servers.
14:25 wushudoin joined #gluster
14:30 Melsom sensei: any clues?
14:30 ndevos Melsom: you have a glusterfs volume that you use to store big files and you access these files through iscsi?
14:30 ndevos if so, that does not really sound like a good glusterfs use-case to me...
14:36 m0zes sounds like a better use-case for drbd... :/
14:42 daMaestro joined #gluster
14:48 abyss^ hi, i've already installed gluster 3.3 and I would like to upgrade to ver 3.3.1 on a debian system. Will downloading the glusterfs-server_3.3.1-ubuntu1~precise1_amd64.deb package, stopping the gluster server, dpkg -i glusterfs-server_3.3.1-ubuntu1~precise1_amd64.deb, and starting gluster be enough? (I use ubuntu packages because earlier there was no package for debian 6.0)
14:54 hagarth joined #gluster
14:54 Melsom m0zes: we've tried drbd, but it doesn't work very well with two nodes. Both nodes thinking the other is down and rebooting.
14:55 Melsom DRBD with two nodes really need both to be online for everything to work, and filesystems open.
14:55 sjoeboo joined #gluster
14:56 Melsom ndevos: it really came down to either glusterfs or drbd.. gluster seemed the best after testing drbd.
14:56 sunus joined #gluster
14:58 glusterbot New news from resolvedglusterbugs: [Bug 802036] readlink fails on CIFS mount <https://bugzilla.redhat.com/show_bug.cgi?id=802036>
15:04 JoeJulian Melsom: You're not writing directly to the bricks, but rather through the fuse mount, right?
15:06 sjoeboo joined #gluster
15:07 sjoeboo joined #gluster
15:08 JoeJulian abyss^: stop glusterd and glusterfsd, upgrade, start glusterd. While glusterfsd is stopped, the brick(s) on that server will be unavailable.
15:13 abyss^ JoeJulian: upgrade = dpkg -i package, yes? Sorry I usually using centos;)
15:13 abyss^ *use
15:13 semiosis abyss^: you will need all three packages, glusterfs-server, glusterfs-common, glusterfs-client
15:13 semiosis @latest
15:13 glusterbot semiosis: The latest version is available at http://download.gluster.org/pub/gluster/glusterfs/LATEST/ . There is a .repo file for yum or see @ppa for ubuntu.
15:14 semiosis abyss^: why do you not use the debian packages here: http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/
15:14 glusterbot Title: Index of /pub/gluster/glusterfs/LATEST/Debian (at download.gluster.org)
15:14 semiosis ???
15:17 sr71 joined #gluster
15:20 abyss^ semiosis: because I used glusterfs_3.3.0-1_amd64.deb (for ubuntu) and I don't want problems ;)
15:22 semiosis abyss^: that package did not come from my PPA
15:22 semiosis you may as well switch to the debian package for 3.3.1 at the ,,(latest) link
15:22 glusterbot The latest version is available at http://download.gluster.org/pub/gluster/glusterfs/LATEST/ . There is a .repo file for yum or see @ppa for ubuntu.
15:22 abyss^ yes, the packages were here: http://download.gluster.org/pub/gluster/glusterfs/3.3/3.3.0/Ubuntu/
15:22 glusterbot Title: Index of /pub/gluster/glusterfs/3.3/3.3.0/Ubuntu (at download.gluster.org)
15:23 semiosis so, don't switch to my ppa, switch to the new debian package
15:23 davdunc joined #gluster
15:23 davdunc joined #gluster
15:24 abyss^ semiosis: so I should download those 3 packages and dpkg -i each? Or should I find a glusterfs_3.3.1-x_amd64.deb package?
15:25 rubbs I've been reading this: http://joejulian.name/blog/should-i-use-stripe-on-glusterfs/ And I'm curious if people would consider VM Image files as "large files with random i/o." I ask because I'm about to start playing with Gluster today or tomorrow on some new hardware and wanted some input as to whether or not striping it was a good idea.
15:25 glusterbot Title: Should I use Stripe on GlusterFS? (at joejulian.name)
15:29 semiosis abyss^: i would recommend making a backup of your /var/lib/glusterd directory.  remove existing 3.3.0-1 package, install new debian packages for 3.3.1, and then if needed restore the backup of /var/lib/glusterd & reboot
15:30 abyss^ semiosis: ok, thank you for advice
15:30 semiosis yw, good luck
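[semiosis' suggestion above, roughly, as a shell sketch; the 3.3.1 package file names are assumptions based on the download.gluster.org Debian directory:

    # back up the gluster working directory first
    cp -a /var/lib/glusterd /var/lib/glusterd.bak
    # stop gluster, swap the packages, restart
    service glusterfs-server stop
    dpkg -r glusterfs                     # the old single 3.3.0-1 package
    dpkg -i glusterfs-common_3.3.1-1_amd64.deb glusterfs-client_3.3.1-1_amd64.deb glusterfs-server_3.3.1-1_amd64.deb
    service glusterfs-server start
    # if the volume definitions look wrong afterwards, restore the backup of /var/lib/glusterd and reboot
]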
15:32 UnixDev joined #gluster
15:36 duerF joined #gluster
15:40 JoeJulian rubbs: Depends on how you use it. Since performance has been unacceptable for some use cases with vm images regardless of the volume type, many that have reported success just use smaller images and use a fuse mounted volume for their data within the vm.
15:43 rubbs hrm... transferring to a fuse mounted volume isn't much of an option in this case. I am the only SysAdmin here, and I need to get some systems migrated off old hardware. My plan is to virtualize first, then work on changing the way the systems work themselves.
15:44 rubbs If I had unlimited time/money/people under my belt, I'd love to do that. I just can't take on too much at once without potentially screwing something up beyond my ability to fix.
15:44 JoeJulian Since vm images are single-thread one client, I don't think stripe will gain you anything.
15:44 rubbs In this case I have time enough to try both and see what kind of performance differences I see.
15:45 rubbs JoeJulian: that was my initial thought too. re: single process per VM
15:45 JoeJulian rubbs: are you doing kvm?
15:46 rubbs yeah.
15:46 rubbs JoeJulian: sorry forgot to hilight you above ^
15:46 JoeJulian upstream has added direct volume support to qemu making it significantly faster.
15:47 rubbs Oh!
15:47 * rubbs looks around.
15:47 JoeJulian I'm trying to decide, myself, if I'm willing to wait for it to filter down into rpms, or if I'm going to be impatient.
15:48 rubbs well, I have some time to test things here, but I'm going to have to put this in production soonish.
15:48 shireesh joined #gluster
15:49 JoeJulian I'm going to wait. We're entering November and my quiet period. No major changes until March.
15:49 guigui3 joined #gluster
15:50 rubbs that must be nice. I'm about to hit some big stuff soon.
15:51 JoeJulian I hate it. I'm usually chomping at the bit waiting for the taxes to be filed so I can get back to work improving things.
15:53 JoeJulian So, semiosis, how'd you fare the weather?
15:54 semiosis we saw mild showers from ts sandy last week and this weekend we've had the best weather all year
15:55 semiosis nothing compared to what the northeast is getting tho
15:55 semiosis so far so good in ec2 n. va
15:55 semiosis it's only a cat 1 hurricane, commercial buildings should stand up to a cat 1 without any issue
15:55 semiosis street signs & shoddy residential structures may get knocked down tho
15:57 semiosis brb
15:58 johnmark semiosis: yes, assuming they weren't dumb enough to build the data center at sea level next to the water
15:58 johnmark it should be fine :)
16:03 neofob joined #gluster
16:06 Melsom JoeJulian: No, we're writing to the client mounts. And as I mentioned, only one iscsi-target is running at any given time, so vmware won't corrupt anything on its own with multipathing.
16:06 Melsom Usually 5MB off on our at the moment 2TB iscsi lun.
16:08 lh joined #gluster
16:18 overclk joined #gluster
16:25 JoeJulian johnmark: You assume they built the datacenter. Sometimes these things just spring up. Look at the hub in Seattle. It's one of the major distribution hubs but it started out as a telco exchange in 1981. The whole datacenter thing just sort-of happened and there's always been struggles to get enough power in through overloaded (and overheated) cable vaults. In a city that's /going/ to have a Richter 9 earthquake when either Mt Rainier blows or
16:25 JoeJulian when the Olympic Peninsula that's being pushed toward Seattle by continental shelf compression finally slips, that sucker's going to be gone.
16:27 semiosis johnmark: i believe ec2 is in the equinix ashburn campus
16:27 semiosis purpose-built & far enough from the sea
16:27 Rico29 joined #gluster
16:28 Rico29 hi all
16:28 JoeJulian Melsom: You do have write-behind caching enabled so your writes are expected to be asynchronous.
16:28 Rico29 Just wanted to know how many machines I need to build a GlusterFS based cluster
16:28 JoeJulian 1
16:28 UnixDev joined #gluster
16:28 JoeJulian Well, technically less than 1.
16:28 Rico29 JoeJulian> with redundancy :)
16:29 JoeJulian I expect the answer you're looking for is 2. :D
16:29 JoeJulian Though I could be really pedantic and offer you something more interesting if you'd like.
16:30 Rico29 ok
16:30 Rico29 I'm looking for a FS to build a filer for a datacenter
16:30 ctria joined #gluster
16:31 johnmark heh heh
16:31 NuxRo hi guys, will any of you be at the redhat dev day in london on the 1st?
16:31 * JoeJulian is surprised he got away with the "less than one" answer without question.
16:31 johnmark NuxRo: I will
16:31 johnmark NuxRo: unfortunately, I will only be of minimal help
16:31 Rico29 JoeJulian> I need redundancy, so "less than 1" will not do the trick
16:32 Rico29 ;)
16:32 JoeJulian NuxRo: There's a way, but it's pretty unlikely. How's your teleporter technology?
16:32 johnmark :)
16:33 Rico29 so with 2 machines, I can have a redundant solution ?
16:33 Rico29 no need of load-balancer or anything else ?
16:33 JoeJulian yep
16:33 JoeJulian correct
16:33 Rico29 ok
16:34 overclk_ joined #gluster
16:35 Rico29 thanks
16:35 Mo_ joined #gluster
16:41 NuxRo johnmark: what help are you talking about? :)
16:41 NuxRo I was hoping to put some faces on your nicks
16:42 Melsom JoeJulian: Yes, I believe so.
16:42 Melsom http://fpaste.org/kHTP/
16:42 glusterbot Title: Fedora Pastebin - by Fedora Unity (at fpaste.org)
16:42 NuxRo JoeJulian: teleportation is still rusty :>
16:42 Melsom Here's our current configuration
16:42 Melsom Bricks are connected with 3x1Gbit bonding.
16:46 aliguori joined #gluster
16:53 JoeJulian Melsom: Yep, that's what I was going on when I mentioned that. If you don't want write-behind you'll have to disable that by setting performance.write-behind off
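[That toggle is a normal volume option; a minimal sketch with an example volume name:

    gluster volume set myvol performance.write-behind off
    # to return to the default later
    gluster volume reset myvol performance.write-behind
]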
16:58 bulde1 joined #gluster
16:59 tc00per Can you drop by my office please.
16:59 tc00per Sorry.
16:59 eightyeight joined #gluster
17:11 JoeJulian tc00per: Am I in trouble?
17:12 tryggvil_ joined #gluster
17:14 tc00per :)
17:17 johnmark NuxRo: ah! well yes, you are welcome to put a face with my nick :)
17:17 johnmark and vice versa
17:17 NuxRo :-)
17:17 NuxRo I'll gladly bother you should I have the chance
17:18 NuxRo have I understood correctly that you have some coupons?
17:19 bulde1 joined #gluster
17:23 seanh-ansca joined #gluster
17:32 Nr18 joined #gluster
17:34 perler joined #gluster
17:36 sjoeboo_ joined #gluster
17:37 tryggvil joined #gluster
17:38 faizan joined #gluster
17:38 johnmark NuxRo: Yes, I do. /msg me
17:40 Gilbs1 joined #gluster
17:47 Gilbs1 Hello all, I see that there is no "showmount" for gluster, so what share name do I use, or how do I find the share name, to mount on windows?
17:49 JoeJulian It's the volume name. showmount normally works, btw.
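[In other words, gluster's built-in NFS server exports each volume under its volume name; with example names, the check and the Windows-side mount look roughly like this (the mount syntax assumes the Windows NFS client feature is installed):

    # on any NFS client
    showmount -e gluster-server        # should list /myvol
    # on Windows, with Services for NFS / the NFS client enabled
    mount -o nolock \\gluster-server\myvol Z:
]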
17:53 daddmac joined #gluster
17:57 Melsom JoeJulian: Either way, it doesn't seem like it syncs up 100% anyway.
17:57 Melsom Configuration looks okay?
17:59 JoeJulian Melsom: yeah. How are you determining that the replicas are not in sync?
18:00 Melsom filesize basically
18:00 Melsom seems to be 5000kB off at all times
18:00 Melsom can't really checksum a 2TB large file. :p
18:01 JoeJulian Well you /can/ but you'd really need to at least pause the vm. :/
18:01 Melsom the active iscsi-target being the one having 5MB extra.
18:01 Melsom is it just the active state data of the vm, or is it real data missing?
18:03 JoeJulian Neither. The fuse client isn't smart enough to know which one is local so as far as glusterfs is concerned, there is no "active". Must be something else.
18:03 JoeJulian s/enough/enough yet/
18:03 glusterbot What JoeJulian meant to say was: Neither. The fuse client isn't smart enough yet to know which one is local so as far as glusterfs is concerned, there is no "active". Must be something else.
18:04 johnmark Melsom: this isn't replica 3, right?
18:04 JoeJulian What's wrong with replica 3? ;)
18:04 Melsom no, replica 2 i believe?
18:04 johnmark Melsom: good :)
18:04 johnmark JoeJulian: I wish I knew :(
18:04 JoeJulian johnmark: Works for me.
18:04 johnmark JoeJulian: oh right, you're the wizard that we should all strive to emulate ;)
18:04 Melsom we have tried to switch to the other brick by enabling iscsi-target on the other brick, and everything seemed to run fine.
18:05 JoeJulian johnmark: I really don't do anything wizardly.
18:05 Melsom but still, we feel like we still can't trust it when there is actual data missing.
18:05 JoeJulian I just install it and it works.
18:05 johnmark JoeJulian: Best. Ad. Ever.
18:06 johnmark Melsom: I understand that
18:06 Melsom i guess glusterfs is best used for file storage, and not "active" files per se.
18:06 JoeJulian Melsom: I'm thinking that there is no data missing, but rather that it's either in a cache or your method of determining the validity of the data is reading some stale cache.
18:07 Melsom you're thinking that the data might just be cached on the node not having it stored on disk?
18:07 Melsom s/node/brick
18:07 neofob does gluster use md5sum from openssl or does it implement its own?
18:07 JoeJulian Again, write-behind caching does just that.
18:07 Melsom What will then happen if we unplug the brick with iscsi-target active.. :p
18:08 johnmark Melsom: great question. One definite way to find out :)
18:08 Melsom haha, true.
18:08 JoeJulian Melsom: If you just unplug it, activity will pause for ping-timeout (42 seconds by default). Your iscsi service will fail-over and everything should proceed as normal.
18:09 Melsom Yep, I'm guessing that as well.
18:09 JoeJulian You do a lot of guessing. ;)
18:09 Melsom :)
18:10 Melsom It just irritates me that the data isn't identical between the bricks, although it might be, counting the cache.
18:12 JoeJulian Melsom: What filesystem are you using on your bricks?
18:12 Melsom xfs
18:14 JoeJulian I swear there must be some combination of system configuration (excluding glusterfs from the equation) that causes ls to return stale data from a hard drive. I've seen too many people say the same thing but it goes against everything I ever see in my own experience and against the majority of what's reported.
18:15 UnixDev how does enforce-quorum help scenarios when you have two nodes in replica 2?
18:16 JoeJulian jdarcy would best answer that if he's around and has power.
18:17 Melsom Sandy?
18:19 JoeJulian yeah
18:19 Melsom :-/
18:23 faizan joined #gluster
18:25 seanh-ansca joined #gluster
18:29 tc00per left #gluster
18:29 tc00per joined #gluster
18:30 Melsom JoeJulian: I do have hardware raid on the nodes with write cache enabled. Shouldn't affect it?
18:32 johnmark UnixDev: that's a feature for 3.4, I think
18:32 johnmark where there will be a "faux" node that is able to determine quorum
18:32 y4m4 joined #gluster
18:33 UnixDev ahh, I see
18:33 UnixDev I thought the code was committed, I'm running svn
18:33 UnixDev any way I can enable it now?
18:33 UnixDev git I mean
18:36 johnmark UnixDev: I actually don't know how far along that feature is. you'd have to check the git logs
18:36 Mo____ joined #gluster
18:36 UnixDev ahh, got it
18:36 johnmark or ask jdarcy or kkeithley when they get here
18:36 UnixDev yeah, I saw it was his commit
18:36 UnixDev ill ask him when I see him around, thank you
18:36 johnmark UnixDev: oh ok
18:36 johnmark cool
18:55 Melsom JoeJulian: Any performance optimization we can do besides what we've already done?
18:56 ekuric joined #gluster
19:02 sjoeboo_ joined #gluster
19:15 UnixDev every time I do a heal, more and more files in split-brain show up… why?
19:21 daddmac should gluster handle this okay? we have three levels of folders with ~500 sub-folders in each folder (500*500*(1-500)/files). would a flat folder structure be better for performance?
19:21 daddmac full disclosure: we have fuse shares on gluster servers mounted onto samba shares, separate servers (client-servers) mounting those shares and re-sharing them out via samba to end-clients.
19:21 daddmac will gluster handle a billion (yes, correct first letter) files of 1 to 30 MB?
19:22 eightyeight joined #gluster
19:38 cbehm daddmac: from my experience, nested is fine. we have a directory (not my fault) with ~25,000 files and you can't really do a directory listing on that
19:39 daddmac wow!  i feel better!
19:39 cbehm our setup is 3 gluster servers replicated
19:40 cbehm i'm not sure what your use case is, but directory listings are probably the most noticeably sluggish experience since it basically forces stat calls for all the files & directories
19:41 cbehm and that's why a 25000 file directory is not a good idea ;)
19:49 randomcamel left #gluster
20:01 davdunc joined #gluster
20:01 davdunc joined #gluster
20:19 badone joined #gluster
20:24 glusterbot` joined #gluster
20:43 UnixDev joined #gluster
20:47 jdarcy joined #gluster
20:48 daddmac we have 4 servers.  one is responsive (2 second response to an ls), and the other is closer to 11 seconds.  i noticed that the responsive one has all the memory allocated, while the slow one has almost no memory allocated.  i'm looking closer, but any ideas or insights are more than welcome!
20:49 Gilbs1 close to 30 seconds at times is more like it :)
21:01 UnixDev jdarcy: I saw some commits you made with respect to adding quorum support… are there any docs on these? I am running git version and it could help with my split-brains…
21:03 jdarcy UnixDev: Hm.  There's built-in documentation associated with the options, but "gluster volume set help" doesn't show them because they're marked NO_DOC.  I wonder why I did that.
21:04 UnixDev lol
21:05 UnixDev jdarcy: it seems that in my failover scenarios I end up with split-brains on small little lock files vmware uses
21:06 UnixDev jdarcy: once this happens, I really can't fix it, even by using JoeJulian's tool for deleting one of the versions of the files… is there an easy way to fix split-brain and delete all affected files?
21:06 jdarcy The inline docs are viewable in the patch: http://review.gluster.com/#patch,sidebyside,743,2,xlators/cluster/afr/src/afr.c
21:06 glusterbot Title: Gerrit Code Review (at review.gluster.com)
21:08 jdarcy Basically, you can set cluster.quorum-type to none/auto/fixed.  For fixed, you need to set cluster.quorum-count as well.  For auto, the effective quorum is N/2 rounded up, or exactly N/2 if that set includes the first named replica.
21:09 jdarcy If quorum enforcement is enabled and you don't have quorum (note: within that AFR volume, which might be only part of a higher-level DHT volume) then any modifying operation - write, create, unlink, rename, chmod, setxattr - will fail with EROFS.
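[A concrete sketch of the options jdarcy describes; the volume name and count are examples, and as noted these apply per AFR replica set:

    # majority-based quorum
    gluster volume set myvol cluster.quorum-type auto
    # or a fixed number of bricks that must be up before writes are allowed
    gluster volume set myvol cluster.quorum-type fixed
    gluster volume set myvol cluster.quorum-count 2
]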
21:09 UnixDev jdarcy: someone mentioned something about a "faux" node to deal with scenarios with only 2 nodes in replica 2?
21:09 UnixDev is that just a client that hooked in as a peer?
21:11 jdarcy UnixDev: There are a few approaches in development to make quorum better.  Quorum of servers cluster-wide (vs. current quorum of bricks within an AFR volume) is one option, arbiter/observer entities to break ties are another.  They're kind of orthogonal to one another.
21:12 jdarcy Personally, I like the arbiter/observer approach.  It bugs me to think that a volume might lose quorum because a server that has no bricks for that volume is unavailable.
21:14 Melsom Are there any other optimizations we can do besides what we've already done? http://fpaste.org/uqNm/
21:14 glusterbot Title: Viewing Paste #247928 (at fpaste.org)
21:14 UnixDev jdarcy: I agree. Is it more simplistic now? I'm anxious to get something to help the split-brain problems I'm dealing with on failover. It's equally possible it could be caused by some other bug dealing with same-name files or something.
21:15 Melsom The two bricks are connected with 3x1Gbit bonding, but we're only seeing about 140MB/s over the connection.
21:16 Melsom The HDDs are HW RAID 10, capable of much more.
21:16 jdarcy UnixDev: Right now it's pretty simplistic.  You can enable it, as long as you understand that with replica 2 you might just get EROFS errors (when that first server goes down) instead of split-brain.  Still an improvement IMO, but something to think about.
21:18 jdarcy Melsom: Are you using replication?
21:18 Melsom jdarcy: yes
21:18 jdarcy Melsom: So you're getting 140MB/s out of 187.5MB/s theoretical max (3x1GbE divided by two for replication)?
21:19 Melsom Sorry, seems to be around 130MB/s, but yes.
21:19 Melsom I thought we could max the link both ways?
21:20 _Bryan_ Is there a way to clear the stale NFS file handle error messages from a client with native gluster mounts...without unmounting and remounting?
21:20 jdarcy Melsom: Are you doing I/O on the servers, or from a third client machine?
21:21 Melsom On the servers
21:22 balunasj joined #gluster
21:22 _Bryan_ I am just trying to clean up my log files from self heals... - Should mention Gluster 3.2.5
21:23 UnixDev jdarcy: with the EROFS scenario, will the surviving node eventually allow write? or...
21:23 benner joined #gluster
21:23 jdarcy Melsom: OK, so yeah, you should be able to get close to 3Gb/s (375MB/s) each way - assuming your network's configured correctly for full duplex, jumbo frames, etc.
21:24 jdarcy Melsom: Also, which bonding mode you use *really matters* but I don't remember which ones are good and which are bad.
21:25 jdarcy UnixDev: In the current version, basically no writing can occur without that first node.  Quorum with N=2 is just bad, which is why we're trying various approaches to make the quorum set larger.
21:25 UnixDev Melsom: I get full throughput on 4x bonded 1g, make sure switch supports 802.3ad, options "miimon=100 mode=4 lacp_rate=1 xmit_hash_policy=layer3+4 ad_select=1"
21:25 jdarcy _Bryan_: I'd like to give you an answer, but TBH it's an area of the code (especially in 3.2.x) that I really don't know well enough.
21:26 Melsom UnixDev: The servers are connected directly
21:26 Melsom Can i still use mode 4?
21:26 UnixDev don't know, try it
21:26 Melsom Currently using mode 0
21:26 UnixDev mode 0 = 2.3gb max for 4x1gig
21:26 UnixDev so it sounds about right for you with 3x
21:27 Melsom :)
21:27 Melsom Good news then :)
21:27 jdarcy Oops, dinner time.
21:27 _Bryan_ jdarcy: I understand...
21:28 UnixDev jdarcy: so really, then there is no way to active my goal with the current master branch… correct? (i.e. one fails, writes continue)
21:28 UnixDev achieve*
21:29 Melsom UnixDev: Currently a live system, so switching bonding mode on the fly is a bit risky.. :p
21:30 Melsom but whats the worst that can happen? re-sync when bond comes up again?
21:31 Melsom UnixDev: any performance increase from using jumbo frames when the servers are directly linked?
21:32 Technicool joined #gluster
21:35 daddmac i'm trying to resolve a performance problem.  i have one node that is very fast, another that is slower.  i noticed the fast one has one glusterfs and 5+ glusterfsd processes, the slow ones all have three glusterfs (and zero or one glusterfsd.)   i'm confused as to to the roles of the two, why glusterfsd would only be present on one (node) and not the other.  links to info are welcome, i haven't had much luck finding any th
21:36 Fabiom Does distributed replication act more like RAID 0+1 than RAID 1+0? 4 servers, 1 brick each, replica count=2. After adding two servers (to make 4 servers) I see the data split now between the first set of 2 and the second set of 2. Any links/blog posts that explain Distributed Replication in detail :)
21:37 JoeJulian ~processes | daddmac
21:37 glusterbot daddmac: the GlusterFS core uses three process names: glusterd (management daemon, one per server); glusterfsd (brick export daemon, one per brick); glusterfs (FUSE client, one per client mount point; also NFS daemon, one per server). There are also two auxiliary processes: gsyncd (for geo-replication) and glustershd (for automatic self-heal). See http://goo.gl/hJBvL for more
21:37 glusterbot information.
21:38 Melsom how should i proceed when changing bonding modes between two bricks?
21:38 JoeJulian Fabiom: No. GlusterFS works at the file level. It is not block-based storage so everything you know about raid doesn't really apply.
21:38 Melsom keeping everything online
21:39 Fabiom JoeJulian: Ok. Any documentation to help me understand at the file-level whats happening. How the files are distributed amongst the nodes ?
21:40 ctria joined #gluster
21:40 semiosis Fabiom: filenames are hashed & the hash is used to distribute files (mostly evenly) among the bricks or replica-sets
21:41 JoeJulian Well, this does a good job of explaining distribute. I'll (hopefully) upload the bit I did on replication after I get home and fix my son's front breaks on his bike. http://www.gluster.org/community/documentation/index.php/GlusterFS_Concepts
21:41 Fabiom semiosis: Ok.
21:41 JoeJulian s/breaks/brakes/
21:41 JoeJulian lol
21:41 glusterbot What JoeJulian meant to say was: Well, this does a good job of explaining distribute. I'll (hopefully) upload the bit I did on replication after I get home and fix my son's front brakes on his bike. http://www.gluster.org/community/documentation/index.php/GlusterFS_Concepts
21:43 JoeJulian johnmark: If you're going to send me anything else for my sasag presentation, you need to do it fast. It's November 8th.
21:44 Fabiom JoeJulian: thanks. I will read this.
21:44 Fabiom JoeJulian: good luck with the bike!
21:45 JoeJulian Heh, thanks.
21:45 Melsom JoeJulian: Any clue as to how I shall proceed when changing bonding modes between two bricks in our two brick gluster?
21:46 Melsom Hopefully keeping everything online on one of the nodes while we're changing modes.
21:46 UnixDev Melsom: yes, jumbo frames matter. use them
21:47 Melsom 9000?
21:47 UnixDev JoeJulian: how can I remove files listed in split brain? tried tool on your blog, can't remove them from the split-brain list
21:48 UnixDev Melsom: yes
21:48 Melsom UnixDev: When using bonding, should i set MTU=9000 on the slave nics only?
21:48 Melsom or the bonding interface aswell?
21:48 UnixDev set it on the bond, the slaves will follow
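[Pulling UnixDev's bonding and MTU advice together, on a RHEL/CentOS 6 style box the bond config would look roughly like this; interface names and addresses are examples, and mode 4 (802.3ad) needs LACP support on the switch or a direct link with both ends configured the same way:

    # /etc/sysconfig/network-scripts/ifcfg-bond0  (sketch)
    DEVICE=bond0
    BONDING_OPTS="miimon=100 mode=4 lacp_rate=1 xmit_hash_policy=layer3+4 ad_select=1"
    MTU=9000
    BOOTPROTO=none
    IPADDR=10.0.0.1
    NETMASK=255.255.255.0
    ONBOOT=yes
    # each slave ifcfg just carries MASTER=bond0 and SLAVE=yes; the MTU follows the bond
]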
21:49 aliguori joined #gluster
21:49 JoeJulian UnixDev: The list doesn't clear. It's a timestamped log. If the split-brained file doesn't appear with a timestamp after you healed it, then it's healed.
21:49 JoeJulian @meh
21:49 glusterbot JoeJulian: I'm not happy about it either
21:51 daddmac thanks glusterbot!  you're my hero!
21:51 daddmac (you too joe)
21:51 JoeJulian :)
21:52 JoeJulian Holy cow! Does this mean that 1,2, & 3 will be remade in a way that doesn't suck?
21:52 JoeJulian http://www.foxnews.com/entertainment/2012/10/30/disney-buying-tar-wars-maker-lucasfilm-for-405b/
21:52 glusterbot Title: Disney buying Star Wars maker Lucasfilm for $4.05B | Fox News (at www.foxnews.com)
21:52 semiosis tar wars?
21:54 UnixDev joined #gluster
21:54 UnixDev is there a chat log of this chan somewhere?
21:55 JoeJulian yes
21:55 JoeJulian pdurbin: where is it again?
21:55 JoeJulian Someone should probably change the topic...
21:55 semiosis someone = a2
21:55 JoeJulian Or he should just unlock it and use glusterbot.
21:55 semiosis @learn chat logs as http://irclog.perlgeek.de/gluster/
21:55 glusterbot semiosis: The operation succeeded.
22:01 a2 semiosis?
22:02 semiosis a2: ohai!
22:02 purpleidea joined #gluster
22:02 purpleidea joined #gluster
22:02 semiosis we were just talking about updating or adding a link to chat logs in the /topic
22:02 a2 joined #gluster
22:03 a2 please make any changes
22:03 semiosis i have the powerrrrrr
22:03 semiosis a2: thanks
22:04 Topic for #gluster is now  Gluster Community - http://gluster.org | Q&A - http://community.gluster.org/ | Patches - http://review.gluster.org/ | Developers go to #gluster-dev | Channel Logs - http://irclog.perlgeek.de/gluster/
22:04 copec joined #gluster
22:05 gbrand_ joined #gluster
22:05 semiosis hm, i thought normal chanops couldn't change topic... must have been confused about that
22:06 a2 glusterbot should be able to change
22:06 SpeeR joined #gluster
22:06 semiosis oh yeah now i remember, i tried getting glusterbot to enable topic protection, which normal chanops can't do
22:07 semiosis chanops can set topic tho
22:07 semiosis a2: sorry to bother you :)
22:08 Melsom any way of changing the 42 second timeout without interrupting anything?
22:08 UnixDev JoeJulian: is there any way to clear that log then? or will it grow forever as the vol gets older?
22:09 JoeJulian Melsom: What a good idea. I don't know why I hadn't thought of that sooner. It would be good to be able to essentially say "Yes, that server truly is dead. Continue without waiting." cc: a2
22:09 JoeJulian UnixDev: I think it expires.
22:10 Melsom Just want to lower it temporarily while we're changing the bond mode between the servers JoeJulian.
22:10 JoeJulian I know that I have some that no longer have entries that used to.
22:10 UnixDev JoeJulian: it seems confusing. as you keep doing heals, the list could grow.. i.e. if you missed a few files… what do you think? maybe it should rotate the log after every heal?
22:11 JoeJulian Melsom: gluster volume set $vol network.ping-timeout N
22:12 JoeJulian UnixDev: My own preference would be that once a split-brain file is healed, it was removed.
22:12 UnixDev i love it!
22:12 Melsom JoeJulian: That way, we could change the bonds, and essentially just have a 5 second freeze?
22:12 JoeJulian file a bug for the enhancement and I'll add myself to the cc list.
22:12 glusterbot https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
22:13 JoeJulian Melsom: yes
22:13 Melsom Thanks :)
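[i.e., something like the following; the volume name and value are examples, and the default is 42 seconds:

    # lower the timeout while the bond is being reconfigured
    gluster volume set myvol network.ping-timeout 5
    # afterwards, restore the default
    gluster volume reset myvol network.ping-timeout
]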
22:14 UnixDev JoeJulian: will do
22:16 JoeJulian Harumph. Bug 848331 wasn't important when I filed it, but as soon as Intuit has a problem with it, suddenly it gets attention. :P
22:16 glusterbot Bug https://bugzilla.redhat.com:443/show_bug.cgi?id=848331 low, high, ---, amarts, ASSIGNED , deadlock related to transparent hugepage migration in kernels >= 2.6.32
22:17 UnixDev_ joined #gluster
22:19 sazified joined #gluster
22:32 Gilbs1 left #gluster
22:43 benner joined #gluster
22:50 UnixDev JoeJulian: lol…funny how that is
22:52 shireesh joined #gluster
23:01 lanning joined #gluster
23:08 daddmac left #gluster
23:15 kevein joined #gluster
23:27 ctria joined #gluster
23:49 usrlocalsbin joined #gluster
23:58 UnixDev joined #gluster
23:59 UnixDev_ joined #gluster
