
IRC log for #gluster, 2013-11-18


All times shown according to UTC.

Time Nick Message
00:42 sprachgenerator joined #gluster
01:08 _pol joined #gluster
01:24 bala joined #gluster
01:47 harish joined #gluster
02:01 asias joined #gluster
02:39 harish joined #gluster
03:02 sgowda joined #gluster
03:03 _pol joined #gluster
03:05 nasso joined #gluster
03:15 bharata-rao joined #gluster
03:39 ThatGraemeGuy joined #gluster
03:45 itisravi joined #gluster
03:52 hagarth joined #gluster
03:55 ngoswami joined #gluster
03:56 shylesh joined #gluster
03:57 _pol joined #gluster
04:06 meghanam joined #gluster
04:14 _pol joined #gluster
04:18 shyam joined #gluster
04:46 RameshN joined #gluster
04:48 ppai joined #gluster
05:01 shruti joined #gluster
05:02 dusmant joined #gluster
05:20 _pol_ joined #gluster
05:21 ndarshan joined #gluster
05:23 hateya joined #gluster
05:25 DV joined #gluster
05:26 CheRi joined #gluster
05:30 lalatenduM joined #gluster
05:34 kevein joined #gluster
05:36 raghu joined #gluster
05:37 rastar joined #gluster
05:44 psharma joined #gluster
05:47 ndarshan joined #gluster
05:55 asias joined #gluster
06:06 darshan joined #gluster
06:08 raghu left #gluster
06:08 itisravi joined #gluster
06:08 raghu joined #gluster
06:09 badone joined #gluster
06:10 _pol joined #gluster
06:12 dusmant joined #gluster
06:13 vimal joined #gluster
06:15 saurabh joined #gluster
06:20 mohankumar__ joined #gluster
06:38 satheesh1 joined #gluster
06:40 T0aD joined #gluster
06:44 nshaikh joined #gluster
06:45 X3NQ joined #gluster
06:47 X3NQ joined #gluster
06:56 shyam joined #gluster
06:58 Rio_S2 joined #gluster
07:00 dkorzhevin joined #gluster
07:01 T0aD joined #gluster
07:04 KORG joined #gluster
07:05 _pol joined #gluster
07:12 vshankar joined #gluster
07:15 asias_ joined #gluster
07:20 kanagaraj joined #gluster
07:23 ricky-ticky joined #gluster
07:27 jtux joined #gluster
07:28 T0aD joined #gluster
07:32 satheesh1 joined #gluster
07:34 ngoswami joined #gluster
07:44 T0aD joined #gluster
07:49 shyam joined #gluster
08:00 ctria joined #gluster
08:03 geewiz joined #gluster
08:03 tjikkun_work joined #gluster
08:05 dusmant joined #gluster
08:13 eseyman joined #gluster
08:18 andreask joined #gluster
08:18 andreask joined #gluster
08:19 getup- joined #gluster
08:19 Rio_S2 joined #gluster
08:27 hybrid5121 joined #gluster
08:27 ngoswami joined #gluster
08:28 eseyman joined #gluster
08:30 raar joined #gluster
08:30 keytab joined #gluster
08:32 sgowda joined #gluster
08:39 rastar joined #gluster
08:47 asias__ joined #gluster
08:50 asias_ joined #gluster
09:03 _pol joined #gluster
09:07 calum_ joined #gluster
09:09 Norky joined #gluster
09:15 spandit joined #gluster
09:20 X3NQ joined #gluster
09:20 diegol__ joined #gluster
09:29 DV__ joined #gluster
09:34 ababu joined #gluster
09:41 vshankar joined #gluster
09:46 darshan joined #gluster
09:53 vshankar joined #gluster
09:58 _pol joined #gluster
10:00 anonymus joined #gluster
10:00 anonymus hi guys
10:00 geewiz joined #gluster
10:01 anonymus tell me please if glusterfs geo-replication works fine with 10 nodes or not? or maybe I do not understand it well?
10:04 baoboa joined #gluster
10:10 getup- hi, i'm seeing a lot of management: connection attempt failed on both bricks in a replicated volume, any thoughts on that?
10:10 getup- that's in etc-glusterfs-glusterd.vol.log by the way
10:18 hybrid5121 joined #gluster
10:24 harish joined #gluster
10:31 sgowda joined #gluster
10:34 raghug joined #gluster
10:37 franc joined #gluster
10:37 franc joined #gluster
10:42 shri joined #gluster
10:43 getup- right, that seemed to be caused by a nfs.disable off and subsequent nfs.disable on, i'm not sure if similar effects are seen on redhat as we're on ubuntu, but a volume reset fixed it for me
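(For reference, the sequence getup- describes maps to gluster CLI commands roughly like the following sketch; the volume name "myvol" is a placeholder, not from this log.)
    gluster volume set myvol nfs.disable off   # re-enabling NFS, which preceded the log messages
    gluster volume set myvol nfs.disable on    # disabling it again
    gluster volume reset myvol                 # reset reconfigured options to defaults; this is what cleared it here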
10:44 shri shruti: Hi .. ping
10:44 shri shruti: have you tried Libgfapi with latest OpenStack Release Havana ?
10:45 shri I have enabled qemu_allowed_storage_drivers variable in nova.conf as
10:45 shri qemu_allowed_storage_drivers=[gluster]
10:46 shri but openstack is not using libgfapi .. it is going to mount the glusterfs by default
10:46 shri can anyone help me here ?
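(As an aside: in Havana-era nova the option shri mentions is usually written in nova.conf as a plain comma-separated list rather than Python list syntax. This is an assumption based on oslo.config conventions, not something confirmed in this log.)
    # /etc/nova/nova.conf, [DEFAULT] section in Havana (assumed)
    qemu_allowed_storage_drivers = gluster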
10:47 hagarth shri: Is this with RDO?
10:47 shri hagarth: I'm trying Devstack on Fedora19
10:47 hagarth getup-: this patch should address your problem. http://review.gluster.org/6293
10:47 glusterbot Title: Gerrit Code Review (at review.gluster.org)
10:48 social joined #gluster
10:48 sgowda joined #gluster
10:48 getup- hagarth: cool, we're not going to use NFS for the moment, i was just playing around when i noticed it
10:49 hagarth shri: how did you figure out that qemu is not using libgfapi?
10:49 shri hagarth: because nova & cinder both are using the mounted Gluster fs, and in ps ax | grep qemu
10:49 shri I could not see IP/volume_name
10:50 hagarth shri: ok, this might be related to https://bugzilla.redhat.com/show_bug.cgi?id=1020979
10:50 glusterbot <http://goo.gl/w4BSBu> (at bugzilla.redhat.com)
10:50 glusterbot Bug 1020979: unspecified, unspecified, rc, eharney, NEW , After configuring cinder for libgfapi, volumes create but do not attach
10:50 shri hagarth: when my instances are running they are using kvm..libvirt .. in that, for the device file, I can see the glusterfs mount path and not IP/volume
10:51 hagarth shri: one more change is needed to get libgfapi working.. let me try pulling that out for you
10:51 getup- is there some documentation on how the healing process of glusterfs works? how does it notice when files need to be healed, for example?
10:51 shri hagarth: in my case with mounted glusterfs I am able to attach a volume, able to boot from it, everything working, but cinder/nova are not using libgfapi
10:52 _pol joined #gluster
10:53 shri hagarth: in my case I'm successfully able to attached bootable cinder volume to nova instance .. and even after that instance is running successfully
10:54 hagarth shri: this was necessary in one of our internal testing to get libgfapi to work
10:54 shri hagarth: but they are using Mounted glusterfs and NOT libgfapi :(
10:54 hagarth on all compute nodes:
10:54 hagarth sed -i "s/conf.source_ports = \[None\]/conf.source_ports = \[\'24007\'\]/"
10:54 hagarth /usr/lib/python2.6/site-packages/nova/virt/libvirt/volume.py
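(Those two messages are one command split across two lines; joined, and assuming the python2.6 site-packages path hagarth pasted, it reads:)
    sed -i "s/conf.source_ports = \[None\]/conf.source_ports = \[\'24007\'\]/" \
        /usr/lib/python2.6/site-packages/nova/virt/libvirt/volume.py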
10:54 purpleidea I wanted to let you all know that I've broken down, cried a lot, and finally wrote some proper? documentation for puppet-gluster. Comments, spelling/grammar bug fixes, and helpful suggestions are welcome!
10:54 purpleidea https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md
10:54 glusterbot <http://goo.gl/WVxxEE> (at github.com)
10:54 purpleidea also available as a pdf: https://github.com/purpleidea/puppet-gluster/blob/master/puppet-gluster-documentation.pdf
10:54 glusterbot <http://goo.gl/6OSk1S> (at github.com)
10:54 hagarth purpleidea: awesome!
10:54 calum_ joined #gluster
10:55 purpleidea hagarth: i guess :P ironically, one reason i started to write puppet code was to avoid writing documentation. oh well. comments appreciated.
10:55 social what spawns /usr/bin/python /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py --version
10:56 hagarth purpleidea: we just cannot avoid documentation :P
10:56 hagarth social: glusterd spawns that if geo-replication is configured
10:56 shri hagarth: so at your site.. did libgfapi work with openstack ?
10:56 hagarth shri: yes
10:56 purpleidea hagarth: indeed. i guess it was time. it's harder to do since i'm not getting paid, but hopefully it will help the new puppet-gluster users :P *cough* have you tried puppet-gluster?
10:57 social I have an issue when I DoS glusterd with gluster volume status/profile/heal requests; after the dos, geo-replication spawns thousands of gsyncd.py and that in the end invokes the oom-killer, which kills glusterd
10:58 hagarth purpleidea: i haven't but know of people who are trying out your modules
10:58 hagarth social: how many such requests do you fire?
10:59 purpleidea hagarth: cool, well if they get stuck, send them my way, after they've looked at the docs and know what was missing for them to be unstuck ;)
10:59 hagarth purpleidea: sure, will do.
10:59 social hagarth: it's hard to tell as the machine gets hammered, but several thousand if you ask about gsyncd
10:59 purpleidea hagarth: cheers!
11:00 shri hagarth: so you mean I need to change this here
11:00 hagarth social: ok, do you have geo-replication configured at all?
11:00 DV joined #gluster
11:00 shri cat  nova/nova/virt/libvirt/volume.py| grep -i conf.source_port
11:00 shri conf.source_ports = netdisk_properties.get('ports', [])
11:00 shri conf.source_ports = [None]
11:01 hagarth shri: yes
11:01 shri hagarth: so this is not available in the latest openstack - Havana release, right ??
11:01 social my reproducer consists of 3 nodes, nodes 1 and 2 are a replica pair with a shared volume, I'm running geo-replication from node 2 to node 3 and running bonnie++ on the mounted volume on node 1 and doing a while+for cycle of gluster volume status/heal... commands
11:02 hagarth shri: not yet, afaiu
11:02 social hagarth: when I kill the status/heal commands after bonnie++ finishes geo-replication on node 2 kicks in and hammers it down
11:02 shri hagarth: :)
11:02 hagarth social: I think it might be related to volume status
11:02 shri hagarth: configuring many variables .. will add this also :)
11:03 hagarth shri: ok :), let me know if you need assistance in getting this working
11:05 hagarth social: can you check if the gsyncd processes keep growing if you do not perform volume status?
11:05 social hagarth: it's not that it would grow, it's that it gets spawned so many times
11:05 shri hagarth: Thanks for kind word :) sure  I will ping when needed...now I will try with above changes
11:06 hagarth social: err, rather I meant the increase in number of gsyncd processes
11:12 social hagarth: the funny thing is that if I move geo-replication to node 1 and still dos node 2, the oom-kill happens on node 2, so hmm it might not be gsyncd :/
11:12 DV joined #gluster
11:12 tjikkun_work joined #gluster
11:12 hagarth social: what gets oom-killed on node 2? is it glusterd?
11:13 andreask1 joined #gluster
11:13 andreask joined #gluster
11:13 social hagarth: well this time it got cassandra so no, it just kills the biggest process at the moment; if it's an empty node then yes, glusterd
11:14 hagarth social: I see
11:18 social hagarth: :( now it killed glusterd on node 1 while I ran everything on node 2
11:18 hagarth social: just curious, how much physical memory + swap do you have on node 1 and 2?
11:19 social 8GB
11:20 social happens in production from time to time with big 64GB nodes
11:22 hagarth social: are you using nfs?
11:22 social nope
11:24 hagarth social: what is the size of physical ram on the node that you run bonnie++?
11:25 social hagarth: ulimit -n 110000; bonnie++ -d ./ -n 16:10000:16:64 -s 15736 -u root:root < -s is twice of ram in mb
11:27 anonymus left #gluster
11:27 tjikkun_work joined #gluster
11:27 hagarth social: would it be possible to try with a lesser -s, say something like 4096 and see if the oom persists?
11:28 social hagarth: it happens with quite different workloads, I don't think bonnie++ has anything to do with it, it's just a cheap way to get gluster under pressure
11:29 social I can probably reproduce this with while loop :/
11:31 hagarth social: I have seen some crazy stuff with bonnie++, haven't seen too many real world applications doing that (byte by byte write of a file which is twice the size of physical memory)?
11:32 hagarth social: nevertheless, your observations are certainly worth a bug report
11:39 getup- joined #gluster
11:44 getup- joined #gluster
11:46 _pol joined #gluster
11:53 kanagaraj joined #gluster
11:55 diegows joined #gluster
11:56 lpabon joined #gluster
11:59 rcheleguini joined #gluster
12:01 calum_ joined #gluster
12:04 spandit joined #gluster
12:17 CheRi joined #gluster
12:20 ppai joined #gluster
12:26 DV joined #gluster
12:29 bala joined #gluster
12:29 getup- joined #gluster
12:29 rwheeler joined #gluster
12:40 shri joined #gluster
12:43 getup- joined #gluster
12:47 DV joined #gluster
12:53 bfoster joined #gluster
12:55 ppai joined #gluster
12:55 CheRi joined #gluster
12:57 hagarth joined #gluster
13:00 andreask joined #gluster
13:03 ricky-ticky joined #gluster
13:04 harish joined #gluster
13:05 calum_ joined #gluster
13:10 kkeithley joined #gluster
13:15 DataBeaver joined #gluster
13:19 dusmant joined #gluster
13:21 getup- i see the following in our log files: "Unable to self-heal contents of '/test.sh' (possible split-brain).", but if i do an info split-brain on the volume it says nothing is wrong
13:23 getup- any ideas?
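(For anyone following along, the checks getup- refers to look roughly like this on 3.3/3.4; the volume name is a placeholder.)
    gluster volume heal myvol info               # entries currently pending heal
    gluster volume heal myvol info split-brain   # entries gluster itself flagged as split-brain
    gluster volume heal myvol info heal-failed   # entries it tried and failed to heal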
13:32 social hagarth: http://paste.fedoraproject.org/54779/84781489 these are last words before oomkil on both nodes :/
13:32 glusterbot Title: #54779 Fedora Project Pastebin (at paste.fedoraproject.org)
13:34 _pol joined #gluster
13:35 calum_ joined #gluster
13:38 chirino joined #gluster
13:40 B21956 joined #gluster
13:41 DataBeaver joined #gluster
13:51 lpabon joined #gluster
13:56 davidbierce joined #gluster
13:57 calum_ joined #gluster
14:01 ctria joined #gluster
14:08 plarsen joined #gluster
14:11 ira joined #gluster
14:11 calum_ joined #gluster
14:13 mattf joined #gluster
14:14 shri_ joined #gluster
14:26 neofob joined #gluster
14:27 hybrid5121 Hi
14:27 glusterbot hybrid5121: Despite the fact that friendly greetings are nice, please ask your question. Carefully identify your problem in such a way that when a volunteer has a few minutes, they can offer you a potential solution. These are volunteers, so be patient. Answers may come in a few minutes, or may take hours. If you're still in the channel, someone will eventually offer an answer.
14:28 hybrid5121 I have 2 questions :
14:28 _pol joined #gluster
14:28 hybrid5121 * is it advisable to use direct-io-mode=disable for dynamic website hosting (PHP code) ?
14:29 hybrid5121 * is it advisable to use direct-io-mode=disable for many writes storage (log files) ?
14:29 hybrid5121 If yes ... in which case is it useful to have direct-io-mode=enable ?
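(For context, direct-io-mode is a mount-time option of the fuse client; a sketch of both forms, with server, volume and paths as placeholders.)
    mount -t glusterfs -o direct-io-mode=disable server1:/webvol /var/www
    # fstab equivalent
    server1:/webvol  /var/www  glusterfs  defaults,_netdev,direct-io-mode=disable  0 0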
14:37 tqrst joined #gluster
14:41 fixxxermet joined #gluster
14:43 bennyturns joined #gluster
14:45 failshell joined #gluster
14:58 tqrst has support for uneven brick sizes improved since 3.3? Last I checked, the balancing algorithm assumed that everything was even.
14:59 tqrst (just had a drive failure and all I have on hand are 4T drives instead of 2)
15:01 fixxxermet left #gluster
15:06 _pol joined #gluster
15:08 andreask joined #gluster
15:10 sohoo joined #gluster
15:12 sohoo hello all, does anyone know what's wrong with these iptables rules? i try to block all clients (mounts) and grant access just to the gluster servers until self-heal finishes, but it looks like gluster is disconnected from the cluster (servers on subnet 192.168.14.0/24), which is allowed
15:12 sohoo iptables -A INPUT -i lo -j ACCEPT
15:12 sohoo iptables -A INPUT -s 192.168.14.0/24 -j ACCEPT
15:12 sohoo iptables -A OUTPUT -d 192.168.14.0/24 -j ACCEPT
15:12 sohoo iptables -A INPUT -d 192.168.14.0/24 -j ACCEPT
15:12 sohoo iptables -A OUTPUT -s 192.168.14.0/24 -j ACCEPT
15:12 sohoo iptables -A INPUT -m state --state NEW -j DROP
15:12 sohoo iptables -A OUTPUT -m state --state NEW -j DROP
15:13 sohoo ssh and other ports can access the server (no issues), it's just the gluster internal communication etc. that's affected
15:14 sohoo very strange
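(An untested sketch of a more conventional way to express that intent: accept established traffic before the catch-all drop, so replies to connections the servers themselves open are never classified as NEW. Subnet as above, ssh left open for management.)
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -s 192.168.14.0/24 -j ACCEPT
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -A INPUT -j DROP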
15:15 jbautista|brb joined #gluster
15:16 bugs_ joined #gluster
15:24 _pol joined #gluster
15:25 jskinner_ joined #gluster
15:35 nshaikh joined #gluster
15:39 sohoo anybody here have a use case for tc and self-heal? there are times where self-heal makes the whole cluster unresponsive for long long hours
15:44 ndk joined #gluster
15:46 rcheleguini joined #gluster
15:47 samppah sohoo: what glusterfs version you are using?
15:53 zerick joined #gluster
15:56 rwheeler joined #gluster
15:56 sohoo 3.3
15:57 kaptk2 joined #gluster
15:59 LoudNoises joined #gluster
16:01 bgpepi joined #gluster
16:03 diegol__ joined #gluster
16:04 Technicool joined #gluster
16:05 KevinMc joined #gluster
16:06 sohoo anybody has some tips regarding that issue, real disaster :)
16:10 ngoswami joined #gluster
16:11 ngoswami joined #gluster
16:15 jbrooks joined #gluster
16:16 Guest19728 joined #gluster
16:27 aib_233 joined #gluster
16:27 KevinMc I'm having an issue sharing a gluster volume over CIFS.  Ultimately, I am seeing horrible read performance on small files which I've narrowed down to a per file lag caused by STATUS_OBJECT_NOT_FOUND errors on the server for every 'NT Create AndX' Request.  My gluster server version is 3.3.1, client (on the SAMBA server) is 3.4.0 and SMB is 3.5.10.  Anybody have any ideas?
16:29 lanning joined #gluster
16:30 Alpinist joined #gluster
16:36 Kins joined #gluster
16:46 aib_233 building a cluster of 2 servers. using the puppet module and I don't know much about gluster apart from the basics. what is the correct procedure to peer the 2 hosts? It works for server1->server2 but I'm getting "is already part of another cluster" in the other direction.
16:47 aib_233 do I only need to run the peer commands from the first server?
16:51 ira joined #gluster
16:59 lbalbalba joined #gluster
17:01 lbalbalba aib_233: what does 'gluster peer status' show on each node ?
17:02 aib_233 http://pastebin.com/CBDFjVWE
17:02 glusterbot Please use http://fpaste.org or http://paste.ubuntu.com/ . pb has too many ads. Say @paste in channel for info about paste utils.
17:04 aib_233 ok, i just realized that i'm having a firewall configuration issue. I'll flush all rules and retry
17:04 lbalbalba ah
17:05 lbalbalba btw, you are encouraged to use hostnames instead of IPs
17:05 lbalbalba it's easier to change the IP of a hostname than it is to change the IPs in the gluster config
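(The usual two-step probe, which also swaps the first server's IP for a hostname in the peer list; hostnames below are placeholders.)
    # on server1
    gluster peer probe server2
    # on server2, probe back so server1 is known by name rather than IP
    gluster peer probe server1
    gluster peer status    # run on both; should show the peer as Connected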
17:08 calum_ joined #gluster
17:10 aliguori joined #gluster
17:12 giannello joined #gluster
17:20 semiosis @later tell kseifried pong
17:20 glusterbot semiosis: The operation succeeded.
17:22 geewiz joined #gluster
17:22 clarkee hi guys
17:23 clarkee i'm going to be using gluster for ovirt storage, building the bricks and volume in the cli and not ovirt
17:23 clarkee will ovirt still be able to use the fast access via api rather than fuse??
17:23 clarkee also - is snapshot support coming soon? :(
17:23 semiosis @later tell kseifried see the "Talk" side of that wiki page: http://www.gluster.org/community/documentation/index.php/Talk:Getting_started_setup_aws
17:23 glusterbot semiosis: The operation succeeded.
17:33 Mo_ joined #gluster
17:35 bma joined #gluster
17:35 bma hey
17:35 bma i have installed and configured a glusterFS ufo
17:36 bma i found javaswift... a java client por swift
17:36 bma once glusterUFO also uses swift
17:36 bma is it possible to use javaswift to connect to my platform
17:37 _pol joined #gluster
17:37 bma im not using gluster with openstack
17:37 semiosis bma: try it & let us know
17:38 bma i am having a problem when i try to authenticate
17:39 semiosis i'm not familiar with swift/ufo but stay around, someone else might have an idea
17:39 bma in the javaswift have this sample to authenticate Account account = new AccountFactory().setUsername(username).setPassword(password).setAuthUrl(url).setTenant(tenant).createAccount();
17:39 bma i use it
17:39 semiosis in the mean time, you should probably look for log files on the server, probably in /var/log/glusterfs
17:39 bma and i get the 401 status code
17:40 bma one thing... how can i get the tenant name in glusterfs?
17:41 bma sorry... i get 400 status code => UNKNOWN
17:46 _pol joined #gluster
17:54 diegol__ joined #gluster
17:54 Guest67977 joined #gluster
18:08 elyograg just noticed a problem.  Still on 3.3.1, not sure if it's a problem in newer versions.  i'll be trying 3.4.1 after I test (and hopefully recreate) my rebalance problems on 3.3.1.
18:09 elyograg the problem: If you set a quota on / that exceeds the size of your volume, df on the volume will increase to the quota size and 'used' space will increase by the same amount.
18:10 elyograg s/df/total space seen with df/
18:10 glusterbot elyograg: Error: I couldn't find a message matching that criteria in my history of 1000 messages.
18:11 elyograg glusterbot: if you start singing Daisy, I'm leaving. :)
18:16 * glusterbot starts humming Daisy ...
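(The quota commands behind the behaviour elyograg describes, as a sketch; volume name and limit value are placeholders, and the problem appears when the / limit exceeds the real volume size.)
    gluster volume quota myvol enable
    gluster volume quota myvol limit-usage / 100TB
    gluster volume quota myvol list
    df -h /mnt/myvol    # total and used both jump by the quota size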
18:24 bulde joined #gluster
18:31 nueces joined #gluster
18:32 Cenbe joined #gluster
18:35 elyograg heh.
18:36 jskinner joined #gluster
18:37 cjh973 joined #gluster
18:43 diegol__ joined #gluster
18:45 andreask joined #gluster
18:49 JoeJulian bma: You get the tenant name when you configure the tenant. How you do that is authentication middleware dependent.
18:49 hateya joined #gluster
18:53 _pol joined #gluster
18:56 diegol__ joined #gluster
19:03 failshell joined #gluster
19:13 Cenbe joined #gluster
19:14 elyograg what would make my gluster testbed mount the volume as /tmp/mntNlNTUd
19:14 elyograg ?
19:16 elyograg a few minutes ago, it was also mounted as /tmp/mnttjJAuh ... I do have it mounted already as /mnt/testvol
19:19 elyograg df output: http://apaste.info/dPrV
19:19 glusterbot Title: Paste #dPrV - Apache Paste Bucket (at apaste.info)
19:20 elyograg everything up through /mnt/usb2 was already mounted.  The last two were done automatically by some process on the machine.
19:23 elyograg the last one has since disappeared.
19:24 MacWinner joined #gluster
19:25 MacWinner do bricks need to be the same size across systems?
19:27 MacWinner and does the fuse client automatically discover other cluster members after you mount it?  I notice that you specify one of your cluster servers in the mount command, so I'm not sure how failover would work
19:27 diegol__ joined #gluster
19:27 elyograg MacWinner: if they aren't, results can be unpredictable.  When you are faced with a completely full brick, new files that would have ended up on that brick will be relocated to other bricks.    there may be a slight performance decrease when reading those files.  if you try to append to a file that lives on a completely full brick, that operation will fail, even though other bricks (and the volume as a whole) will say there is available space.
19:28 elyograg there is not any way that I know to affect the file distribution with weights, so a distributed volume with different size bricks *will* fill up smaller bricks before the whole volume is full.
19:28 MacWinner elyograg, thank you!  i've just spent a lot of time looking at other solutions including ceph. I seemed to have fallen for the ceph hype… gluster was so simple to setup, so I think i am going with it.
19:29 elyograg the fuse mount talks to the server named in the mount command, downloads the full volume info, then connects to all bricks.  If you change the volume, existing clients are informed about the changes.
19:29 MacWinner awesome.. that's perfect
19:33 MacWinner in gluster, what is considered "high latency"?  5ms? 10ms? 50ms?
19:33 MacWinner if I have 13ms latency between nodes, is there any obvious issue that will arise?
19:35 sashko joined #gluster
19:35 hagarth joined #gluster
19:35 sashko hey guys
19:35 bgpepi joined #gluster
19:36 elyograg gigabit typically has latency well under 1ms ... or at least that's what you get with a ping.  gigabit is often considered too slow for a gluster install.  10Gb is preferred, or even better is infiniband.
19:36 JoeJulian Depends on the use-case of course.
19:36 elyograg indeed.
19:37 elyograg I'm using gigabit at the moment.  I could definitely use an upgrade. :)
19:37 sashko can anyone shed some light why ls on a dir is faster than find ./ ? is it because find does stat?
19:38 JoeJulian The issue is that the latency will be multiplied for some operations due to the need for multiple round trips. If you open a lot of files, that latency will be very evident. If you leave large files open, not so much.
19:38 elyograg sashko: yes.  a stat initiates a self-heal check.  I don't know what all is involved in that check, but it's expensive.
19:38 JoeJulian sashko: yes. "ls -l" or "ls --color" should be just as slow.
19:39 sashko does anyone know how to do a find without stat?
19:39 sashko i can't find an option
19:39 JoeJulian impossible.
19:39 JoeJulian Find has to know which of the dirents are directories.
19:40 JoeJulian sashko: Are you using 3.4 and a recent kernel?
19:40 elyograg find can filter on all sorts of info, all of which is obtained with a stat.
19:40 sashko JoeJulian: no, 3.2 on a 2.6.18 kernel
19:41 JoeJulian Ouch!
19:41 daMaestro joined #gluster
19:41 JoeJulian The changes in 3.4 to use readdirplus in cooperation with fuse updates in the kernel help a lot with that.
19:43 sashko yeah ouch is the right emotion :)
19:44 sashko JoeJulian: which fuse updates? did they go into > 2.6.32.x kernel or also into 2.6.18?
19:44 JoeJulian I know they've been backported into EL6 kernels. I don't know about EL5. Definitely not EL4 though... :D
19:48 rwheeler joined #gluster
19:51 bugs_ joined #gluster
19:53 sashko :-D
20:00 zaitcev joined #gluster
20:01 cjh973 does gluster 3.4 still use the indices/xattr directory to mark files needing heals?
20:03 JoeJulian Yes, that's used for allowing the heal to be performed in an efficient manner, rather than requiring a complete walk of the entire tree every time a server leaves the pool.
20:03 cjh973 JoeJulian: ok cool.
20:04 cjh973 do you know if there's a way to figure out based on the xattr entry which file requires healing?
20:04 cjh973 i was trying to figure this out the other day but didn't get very far
20:08 MacWinner elyograg, the access to the files i'm storing in gluster will be very sporadic (not being used for home directories or anything). basically it will be used kind of like dropbox to store documents and then periodically access them or download them remotely via our website..  in this use case, would you imagine the 13ms latency to be much of an issue?
20:09 MacWinner elyograg, also, was curious what is your particular use case that is maxing out your gigE?
20:10 ThatGraemeGuy joined #gluster
20:10 spechal joined #gluster
20:11 elyograg MacWinner: storing millions of photographs.  we currently have about 25 terabytes on gluster.  Most of the over 200TB is currently on SAN devices, but we found that it wouldn't scale very well without additional NFS heads, but between the time we put the system together and the time we needed to go a lot bigger, Oracle bought Sun and Solaris became significantly more expensive than free.
20:11 nullck Hi guys, please, I had a problem on my web cluster using GFS2 as the filesystem; I can migrate to glusterfs, but what version is stable ? and where can I find a stable version or repository ?
20:12 glusterbot New news from newglusterbugs: [Bug 1031164] After mounting a GlusterFS NFS export intial cd'ing into directories on a Tru64 resaults in a Permission Denied error <http://goo.gl/oC9dnX> || [Bug 1031166] Can't mount a GlusterFS NFS export if you include a directory in the source field on Tru64 (V4.0 1229 alpha) mount <http://goo.gl/Rh8LGu>
20:12 spechal Can Gluster be setup to replicate over a different IP than the IP used for the client?
20:12 MacWinner elyograg, ahh.. thank you!
20:13 elyograg spechal: gluster can do inter-server communications over a different address than client connections ... but in normal situations, replication is not done as an inter-server communication - the client connects to all bricks and it's the client that writes the data more than once.
20:14 spechal do you know where I can find documentation regarding the inter-server communications and setup?
20:16 elyograg not sure there is any.  I described my setup that uses different LANs for inter-server and client stuff on the mailing list, though.  http://www.mail-archive.com/gluster-users@gluster.org/msg13372.html
20:16 glusterbot <http://goo.gl/9BE5YF> (at www.mail-archive.com)
20:17 elyograg I accomplished it by putting different info in /etc/hosts on the machines with bricks.
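(The split-horizon /etc/hosts approach elyograg describes boils down to resolving the brick hostnames differently on servers and clients; the addresses and names below are purely illustrative.)
    # /etc/hosts on the gluster servers - brick names resolve to the storage LAN
    10.10.0.1    gluster1
    10.10.0.2    gluster2
    # /etc/hosts (or DNS) on the clients - same names resolve to the client-facing LAN
    192.168.1.1  gluster1
    192.168.1.2  gluster2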
20:18 JoeJulian cjh973: Just giving it a cursory look, I'm not sure. xattrop-5898eb68-a593-498d-a796-df44d8645e0c looks like that should match a gfid. Although I have that entry, I have no file with that gfid so I'm not sure. When a file is "dirty", its gfid is hardlinked to that inode. Perhaps that xattrop entry is the uuid of a self-heal daemon? I'll try to remember to dig into the source later and look. I've got to get over to the CoLo though, so I can't
20:18 JoeJulian spend a lot of time on this right now.
20:20 cjh973 JoeJulian: ok cool.  thanks for the help.  I need to start reading the source also
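(If an xattrop entry does turn out to be a file's gfid, one common way to map it back to a path is via the .glusterfs hardlink, which works for regular files; the brick path and gfid below are placeholders.)
    GFID=01234567-89ab-cdef-0123-456789abcdef
    BRICK=/export/brick1
    # the .glusterfs path is built from the first two byte pairs of the gfid
    find $BRICK -samefile $BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID -not -path "*/.glusterfs/*"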
20:20 JoeJulian nullck: We're recommending 3.4.2
20:20 JoeJulian nullck: Which just happens to be the ,,(latest)
20:20 glusterbot nullck: The latest version is available at http://goo.gl/zO0Fa . There is a .repo file for yum or see @ppa for ubuntu.
20:22 elyograg is 3.4.2 actually done and released?  it's not on the main download link from gluster.org yet.
20:22 * JoeJulian needs more coffee.... 3.4.1
20:23 JoeJulian Why did I get up at 4:00 with my wife?
20:23 elyograg i hate days like that.  I seem to have them a lot.  less so now that I hardly ever drink any caffeine, though.
20:24 ThatGraemeGuy joined #gluster
20:24 elyograg my caffeine of choice was Dr. Pepper.  Didn't drink a lot of coffee.
20:25 JoeJulian I just have 1 espresso each morning. Dr. Pepper if I need more.
20:25 nullck JoeJulian, ok, thank's
20:28 lpabon joined #gluster
20:28 elyograg glusterbot: please give me the url to file a bug.
20:28 * JoeJulian pokes glusterbot
20:28 * elyograg pokes glusterbot.
20:28 elyograg heh.
20:29 JoeJulian file a bug
20:29 glusterbot http://goo.gl/UUuCq
20:29 elyograg punctuation. :)
20:29 JoeJulian maybe...
20:29 JoeJulian glusterbot: file a bug
20:29 JoeJulian Ah, that's it.
20:29 JoeJulian He's all like, "file a bug isn't a command I recognize" because it's prefixed with his name.
20:29 glusterbot http://goo.gl/UUuCq
20:36 MacWinner when using replica sets, does the brick count always need to be a multiple of the replica count?  ie, if replica = 3, then you must have 3, 6, 9, 12 etc bricks?
20:37 semiosis MacWinner: yes
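(So for replica 3 the brick list comes in multiples of three, with each consecutive group of three forming one replica set; a 6-brick example with placeholder hostnames and paths.)
    gluster volume create myvol replica 3 \
        server1:/export/brick1 server2:/export/brick1 server3:/export/brick1 \
        server1:/export/brick2 server2:/export/brick2 server3:/export/brick2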
20:44 calum_ joined #gluster
20:45 elyograg bug filed. glusterbot ought to be informing us soon.
20:52 diegol__ joined #gluster
20:53 clarkee sooooooo
20:54 clarkee any idea on when snapshots might become available?
20:54 sashko heya semiosis :)
20:55 semiosis hi sashko
20:55 sashko how's it going?
20:55 semiosis clarkee: maybe 3.5?  http://www.gluster.org/community/documentation/index.php/Planning35
20:55 glusterbot <http://goo.gl/l2gjSh> (at www.gluster.org)
20:55 semiosis sashko: going well, you?
20:55 sashko same here
20:57 diegol__ joined #gluster
20:57 clarkee semiosis: sweet :)
20:57 clarkee anybody here used gluster under ovirt?
20:57 sashko do you guys know where I can find info about the new xattr and gfid structure in 3.3 and 3.4?
20:58 sashko things were easier under 3.2, you just looked at the files' xattrs, but now there is other info stored in some gluster dirs?
20:59 elyograg sashko: have you seen JoeJulian's blog post on it from last year?  http://joejulian.name/blog/what-is-this-new-glusterfs-directory-in-33/
20:59 glusterbot <http://goo.gl/j981n> (at joejulian.name)
20:59 sashko oh boy! someone give him a medal!
20:59 sashko thanks elyograg!
21:12 diegol__ joined #gluster
21:12 glusterbot New news from newglusterbugs: [Bug 1031817] Setting a quota for the root of a volume changes the reported volume size <http://goo.gl/ow7PKV>
21:14 lbalbalba joined #gluster
21:14 MacWinner when setting up geo-sync, can you set up bidirectional master->slave setups between beachheads in both sites?
21:15 MacWinner or would that cause some sort of weird looping problems?
21:15 rcheleguini joined #gluster
21:16 MacWinner say I have 3 nodes in 2 sites.. node1,2,3 in New york, and node4,5,6 in califorina.. can I reliably setup node1 and node4 to do master->slave between them in each direction?
21:17 muhh joined #gluster
21:17 diegol__ joined #gluster
21:27 JoeJulian MacWinner: Not yet
21:28 MacWinner thanks
21:29 JoeJulian Huh... It didn't even make the "nice to have" for 3.5 planning... http://www.gluster.org/community/documentation/index.php/Planning35#GlusterFS_3.5_Release_Planning
21:29 glusterbot <http://goo.gl/0AplHT> (at www.gluster.org)
21:29 MacWinner JoeJulian, any current best practice for making 2 sites sync in a gluster environment?
21:31 semiosis https://botbot.me/freenode/gluster/msg/6549868/
21:31 glusterbot Title: Logs for #gluster | BotBot.me [o__o] (at botbot.me)
21:32 semiosis rumor is 3.7, a year from now
21:32 JoeJulian Not really. I think most of the time that's needed the decision's been to treat one copy as the write copy and use the closest one as read-only.
21:33 MacWinner would be interesting if Galera replication protocol/library could be used with cross site gluster replication
21:35 JoeJulian I think jdarcy's journal based replication might be what that's waiting on.
21:47 MacWinner JoeJulian, do you see any downside in using something like Unison to keep the 2 beachheads in sync?  with each site having its own independent gluster
21:48 P0w3r3d joined #gluster
21:53 JoeJulian Nothing comes to mind
21:55 _disturbed joined #gluster
21:56 _disturbed left #gluster
21:56 johnmark JoeJulian: yeah, journal-based or changelog-based replication
21:56 johnmark will be the basis for multi-master
21:56 johnmark JoeJulian: but I figured it would be 3.6
21:57 JoeJulian Hey, johnmark, where're we going next? ;)
21:57 _dist joined #gluster
21:58 _dist hey there everyone :) anyone around who has compared metrics of the native mount.glusterfs to the qemu backend?
21:59 JoeJulian Are you referring to hosting an image on the native mount vs accessing that image via the api? If so the api seems to be roughly 6 x faster.
21:59 _dist right, that's what I read. But when I did my own testing I found they were extremely close, if anything the native mount was 1-2% faster. The test I used was simple though, dd with oflag=dsync for 1k, 4k, 1M and 500M
22:01 _dist I assume I must be doing somethign wrong
22:01 _dist something*
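(The style of test _dist describes, for reference; oflag=dsync forces a flush per block, so it mostly measures synchronous write latency rather than streaming throughput. The target path and sizes are placeholders.)
    dd if=/dev/zero of=/mnt/testvol/ddtest bs=4k count=10000 oflag=dsync
    dd if=/dev/zero of=/mnt/testvol/ddtest bs=1M count=500   oflag=dsync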
22:02 JoeJulian raw? qcow2?
22:02 _pol joined #gluster
22:02 _dist raw, writethrough or writeback
22:02 badone joined #gluster
22:03 _dist I was using the boot cd system rescue cd for it
22:04 _dist tested with both a replica 2 and a single node on its' own
22:05 JoeJulian Isn't the slower part more random io or multiple files? Are you maxing out your network with a dd test both ways?
22:05 JoeJulian And, of course my favorite, does testing with dd represent your real-world needs?
22:06 _dist well the first test doesn't even go over network (cause it's only a single brick) but on the replica tests yes network maxed on both native/api
22:06 _dist and no, the dd test does not represent the real world needs :) but I did expect the iops and throughput to be higher on the api
22:08 _dist I assume it's obvious but the tests were conducted from within a guest, it was a pc-015, writing to a virtio raw
22:09 JoeJulian My thought is that maybe your context switch bottleneck isn't being reached with your test.
22:12 _dist That's likely, the single brick test definitely maxed out my disk array at around (1200MB/s read and 400MB/s write) and the replica is maxing 1gb. Is the api speed increase only obvious under certain test types or volume types?
22:13 semiosis infiniband + ssd
22:15 cfeller I have a gluster mount point (fuse) in my fstab, but it doesn't always get mounted on boot - it appears to be because the network connection isn't always active at the time the gluster volume is trying to be mounted.
22:15 cfeller is there a way to ensure that doesn't happen (short of moving the gluster mount into its own init script)?
22:15 cfeller The gluster fuse client is 3.4.1, on Fedora 18.
22:16 semiosis cfeller: add a 'sleep 10' to the top of /sbin/mount.glusterfs?
22:16 JoeJulian cfeller: Do you have _netdev set on your mounts?
22:17 cfeller JoeJulian: yes.  It looks like this:
22:17 cfeller <server>:gv0    /mnt/gluster/gv0 glusterfs defaults,_netdev 0 0
22:18 cfeller semiosis: that could work, but I would have to remember to re-add that line next time I update gluster.  I'd rather just put a custom mount script in rc.local.
22:20 _dist semiosis: so unless I'm pushing 1GByte/s (or thereabouts) I won't see the difference? I'll do some more complicated tests (multiple simultaneous VMs, etc) and see if I can find a difference. But if the native mount can push 500MB/s/vm I probably won't use the API until it's better incorporated in libvirt gui's interfaces (it's not right now)
22:21 semiosis _dist: consider throughput vs latency
22:21 phox left #gluster
22:21 JoeJulian cfeller: netdev mounts aren't supposed to happen until the network is up. Is this using network.service or NetworkManager.service?
22:23 cfeller JoeJulian: network.service
22:23 cfeller (NetworkManager.service is disabled - I meant to remove it like I normally do on servers)
22:24 _dist semiosis: Ok, I'll setup something like 10VMs on the cluster and run sets of different IO tests on each and see if I can find scenarios where the api outperforms native significantly. Thanks for your help
22:24 semiosis _dist: that's great!  let us know how it goes please
22:24 _dist I will for sure, ttyl
22:25 nueces_ joined #gluster
22:25 _dist left #gluster
22:26 nueces joined #gluster
22:32 elyograg joined #gluster
22:35 geewiz joined #gluster
22:38 cfeller JoeJulian: looking at the logs more closely, my initial assumption this morning was incorrect.
22:38 cfeller I'll blame being undercaffeinated, my bad.
22:38 cfeller The network is starting ahead of time, it looks like it is failing for a different reason.
22:38 cfeller Here is a snippet of messages: http://ur1.ca/g2634
22:38 glusterbot Title: #54951 Fedora Project Pastebin (at ur1.ca)
22:39 cfeller it doesn't always fail though, and the times that it does, ssh'ing in and issuing a "mount -a" corrects the problem.
22:40 aib_233 i keep getting this messages and can't figure out what it means: [2013-11-18 22:25:53.547991] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:1016)
22:40 glusterbot aib_233: That's just a spurious message which can be safely ignored.
22:44 cfeller JoeJulian: however, the gluster mount log from the same time, says that there was no route to host: http://ur1.ca/g2645
22:44 glusterbot Title: #54954 Fedora Project Pastebin (at ur1.ca)
22:45 cfeller so my initial undercaffinated observation was correct?
22:45 cfeller so even though systemd thinks networking has started, the connection hasn't been completely set up?
22:57 sashko cfeller: you have selinux enabled?
22:57 sashko looks like it's selinux related, when you log in, you are doing it as the root user which has a different context than the user doing it during start up
22:58 cfeller yes, selinux is enabled, as it is a webserver.
22:59 cfeller i had to set: setsebool -P httpd_use_fusefs 1
22:59 cfeller to get it to play nice with apache.  should I have to do anything else?
22:59 sashko is this an rpm install?
22:59 cfeller yes.
22:59 sashko wonder why the rpm doesn't set the proper stuff
23:00 sashko are you able to reboot the server?
23:00 cfeller yup.
23:00 sashko disable selinux for a short period
23:00 sashko reboot it and see if that works
23:01 cfeller ...in progress.
23:01 JoeJulian make permissive, not disable. At least if it's permissive you can just setenforce 1 once you're shelled back in.
23:01 cfeller that is what I did.
23:02 sashko or he can just reboot again :)
23:03 JoeJulian inefficient. :P
23:03 cfeller # getenforce
23:03 cfeller Permissive
23:03 cfeller yet it failed again... grr.
23:04 sashko hmm
23:05 sashko this is a problem:
23:05 sashko Nov 18 11:06:59 gatekeeper systemd[1]: Mounting FUSE Control File System...
23:05 sashko Nov 18 11:06:59 gatekeeper mount[933]: Mount failed. Please check the log file for more details.
23:05 sashko Nov 18 11:06:59 gatekeeper systemd[1]: Started The Apache HTTP Server.
23:05 sashko I assume you are using fuse to access gluster?
23:05 cfeller yes, using the fuse client.
23:06 jskinner_ joined #gluster
23:07 JoeJulian I'm not sure why you were thinking selinux, sashko. no route to host seems pretty obvious. Some network chipsets take forever to establish a link, is this dhcp?
23:07 cfeller static IPs
23:08 sashko where do you see no route to host?
23:08 sashko did i miss it?
23:08 cfeller however, the gluster mount log from the same time, says that there was no route to host: http://ur1.ca/g2645
23:08 glusterbot Title: #54954 Fedora Project Pastebin (at ur1.ca)
23:08 cfeller copy/paste for you.  =)
23:09 cfeller JoeJulian, sashko: I'm going to have to leave unfortunately.  I'll leave my IRC client running, and I can try any suggestions you have for me tomorrow.
23:09 sashko oh i missed that log paste
23:10 Guest19728 joined #gluster
23:10 JoeJulian Another lame brute-force method... how about adding the mount option fetch-attempts=600
23:11 sashko :)
23:11 sashko btw I think JoeJulian might be right, your gluster mount happens right after network start
23:11 sashko depending on your infrastructure it could take a while for network to be up
23:11 sashko does bluster have a timeout option via stab, JoeJulian?
23:11 sashko gluster
23:11 sashko fstab
23:12 sashko damn fingers
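(Folding JoeJulian's suggestion into the fstab line cfeller pasted earlier would look something like this; fetch-attempts retries the volfile fetch, and backupvolfile-server, also a real mount.glusterfs option added here only as an extra safety net, names a second server to try.)
    <server>:gv0  /mnt/gluster/gv0  glusterfs  defaults,_netdev,fetch-attempts=600,backupvolfile-server=<server2>  0 0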
23:29 _pol joined #gluster
23:37 sprachgenerator joined #gluster
23:38 lbalbalba left #gluster
23:57 ira joined #gluster
