IRC log for #gluster, 2016-04-30

All times shown according to UTC.

Time Nick Message
00:00 ctria joined #gluster
00:01 betheynyx joined #gluster
00:12 wushudoin joined #gluster
00:16 RameshN_ joined #gluster
00:21 bennyturns joined #gluster
00:59 dgandhi joined #gluster
00:59 dgandhi joined #gluster
01:02 dgandhi joined #gluster
01:04 EinstCrazy joined #gluster
01:16 EinstCra_ joined #gluster
01:19 harish_ joined #gluster
01:24 EinstCrazy joined #gluster
02:02 hagarth joined #gluster
02:07 The_Pugilist joined #gluster
02:08 hackman joined #gluster
02:11 DV joined #gluster
02:58 EinstCrazy joined #gluster
03:02 RameshN_ joined #gluster
03:18 betheynyx joined #gluster
03:18 plarsen joined #gluster
03:21 EinstCrazy joined #gluster
03:25 aravindavk joined #gluster
03:39 nishanth joined #gluster
04:24 natarej joined #gluster
04:38 spalai joined #gluster
04:45 shubhendu joined #gluster
04:48 sage joined #gluster
04:54 EinstCrazy joined #gluster
04:55 plarsen joined #gluster
04:55 EinstCra_ joined #gluster
04:58 poornimag joined #gluster
04:58 aravindavk joined #gluster
05:09 harish_ joined #gluster
05:11 EinstCrazy joined #gluster
05:17 EinstCrazy joined #gluster
05:20 EinstCra_ joined #gluster
05:30 EinstCrazy joined #gluster
05:32 EinstCrazy joined #gluster
05:33 ramky joined #gluster
05:36 EinstCrazy joined #gluster
05:46 ndarshan joined #gluster
05:54 aravindavk joined #gluster
05:54 EinstCrazy joined #gluster
06:10 kdhananjay joined #gluster
06:13 nishanth joined #gluster
06:17 karthik___ joined #gluster
06:21 EinstCrazy joined #gluster
06:26 EinstCrazy joined #gluster
06:29 JesperA- joined #gluster
06:42 atinm joined #gluster
06:43 ctria joined #gluster
06:43 EinstCrazy joined #gluster
06:50 EinstCra_ joined #gluster
07:00 kovshenin joined #gluster
07:03 EinstCrazy joined #gluster
07:12 Wizek joined #gluster
07:22 rouven joined #gluster
07:45 Wizek joined #gluster
07:49 EinstCrazy joined #gluster
07:50 EinstCra_ joined #gluster
07:56 mhulsman joined #gluster
08:16 EinstCrazy joined #gluster
08:28 karnan joined #gluster
08:30 shubhendu joined #gluster
08:39 EinstCrazy joined #gluster
08:53 betheynyx joined #gluster
09:05 MikeLupe joined #gluster
09:18 EinstCrazy joined #gluster
09:21 atalur joined #gluster
09:23 EinstCrazy joined #gluster
09:24 EinstCrazy joined #gluster
09:44 shubhendu joined #gluster
10:24 EinstCrazy joined #gluster
10:42 nishanth joined #gluster
10:55 Lee1092 joined #gluster
11:22 mhulsman joined #gluster
11:29 mowntan joined #gluster
11:47 haomaiwang joined #gluster
11:54 Gnomethrower joined #gluster
12:01 haomaiwang joined #gluster
12:08 Logos01 joined #gluster
12:14 bowhunter joined #gluster
12:41 natarej joined #gluster
12:45 level7 joined #gluster
12:51 EinstCrazy joined #gluster
12:56 EinstCrazy joined #gluster
13:01 haomaiwang joined #gluster
13:04 spalai left #gluster
13:07 nbalacha joined #gluster
13:14 betheynyx joined #gluster
13:16 EinstCrazy joined #gluster
13:19 rafi joined #gluster
13:24 johnmilton joined #gluster
13:27 chirino_m joined #gluster
13:28 skoduri joined #gluster
13:48 plarsen joined #gluster
13:50 mowntan joined #gluster
13:50 mowntan joined #gluster
14:01 haomaiwang joined #gluster
14:06 chirino joined #gluster
14:08 mhulsman joined #gluster
14:08 EinstCrazy joined #gluster
14:26 Gnomethrower joined #gluster
14:31 hackman joined #gluster
14:38 bennyturns joined #gluster
14:54 shyam joined #gluster
15:01 haomaiwang joined #gluster
15:02 level7_ joined #gluster
15:07 MikeLupe joined #gluster
15:25 kovshenin joined #gluster
15:26 shubhendu joined #gluster
15:30 vmallika joined #gluster
15:30 Jiffin joined #gluster
16:01 haomaiwang joined #gluster
16:01 Gnomethrower joined #gluster
16:09 ctria joined #gluster
16:11 spalai joined #gluster
16:13 vmallika joined #gluster
16:42 primehaxor joined #gluster
16:50 natarej joined #gluster
16:57 Chaot_s joined #gluster
17:00 Chaot_s Hi all, short question: if I have 2 (unbalanced) bricks with, say, 4GB of free space on one and 2GB on the other, what happens if I write a 3GB file? Will DHT decide to place it on the brick with 4GB free space, even if it should end up on the other brick?
17:01 haomaiwang joined #gluster
17:03 spalai Chaot_s: DHT cannot guarantee that
17:04 Chaot_s guess I'm going to fire up a pair of VMs to run the test then
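
For a quick test like the one Chaot_s describes, a plain two-brick distributed volume is enough. A minimal sketch, assuming two hypothetical hosts vm1 and vm2 with bricks under /bricks/b1 (all names and sizes are illustrative only):

    # peer the second node and create a pure-distribute (no replication) volume
    gluster peer probe vm2
    gluster volume create testvol vm1:/bricks/b1 vm2:/bricks/b1
    gluster volume start testvol
    # mount it on a client and write a file larger than one brick's free space
    mount -t glusterfs vm1:/testvol /mnt/testvol
    dd if=/dev/zero of=/mnt/testvol/bigfile bs=1M count=3072
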
17:05 kotreshhr joined #gluster
17:05 kotreshhr left #gluster
17:08 primehaxor joined #gluster
17:12 _nex_ joined #gluster
17:17 _nex_ joined #gluster
17:17 spalai Chaot_s: so the 3GB file can hash to any one of the bricks, and if it hashes to the 2GB brick, the write request will ultimately fail with ENOSPC
17:17 spalai does this answer your question?
17:18 Chaot_s I guess it does. Then that means the min free space limit needs to be set to the largest file to be written, I guess :)
17:18 post-factum Chaot_s: consider using sharding
17:18 post-factum Chaot_s: that will split your big file across the bricks
17:19 spalai Chaot_s: absolutely right
17:20 Chaot_s I'm investigating replacing a single SAN with a solution for 2PB; the max file size we expect to be written is in the ~3TB range.
17:20 spalai Chaot_s:  as post-factum suggested, sharding is the right setup
17:22 _nex_ joined #gluster
17:22 spalai +
17:24 spalai Chaot_s: setting min-free-disk is always tricky
17:24 spalai you need to know the biggest file size that you are going to write
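
If min-free-disk is used anyway, it is a regular volume option. A minimal sketch, assuming a hypothetical volume name "myvol" and an illustrative 10% reserve (not a recommendation):

    # tell DHT to avoid placing new files on bricks below this free-space threshold
    gluster volume set myvol cluster.min-free-disk 10%
    # verify the current value
    gluster volume get myvol cluster.min-free-disk
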
17:26 Chaot_s It's meant for end users, and we do not really limit their usage, so every now and then someone may just think... let's back up my PC and toss in a 3TB file. Though with users you never know.
17:27 _nex_ joined #gluster
17:27 Chaot_s That's why I'm looking into what happens when stupid strikes.
17:28 spalai That's where sharding comes into the picture. If the file sizes fall into the bigger bucket, as in your case, sharding should be used.
17:29 Chaot_s Is sharding production safe then? I read that it is still more on the experimental side.
17:38 dlambrig_ joined #gluster
17:54 betheynyx joined #gluster
18:01 haomaiwang joined #gluster
18:01 spalai left #gluster
18:26 atalur joined #gluster
18:30 karnan joined #gluster
18:48 kovshenin joined #gluster
18:58 Peppard joined #gluster
19:01 haomaiwang joined #gluster
19:10 post-factum Chaot_s: it is considered safe
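
Turning sharding on is likewise just a pair of volume options. A minimal sketch, assuming a hypothetical volume "myvol"; the 512MB block size is an illustrative value, not a recommendation:

    # enable the shard translator and choose a shard block size
    gluster volume set myvol features.shard on
    gluster volume set myvol features.shard-block-size 512MB
    # note: only files created after sharding is enabled are split into shards
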
19:13 level7 joined #gluster
19:17 level7_ joined #gluster
19:32 Peppard joined #gluster
19:42 spalai joined #gluster
19:42 spalai left #gluster
20:01 haomaiwang joined #gluster
20:05 jiffin joined #gluster
20:07 jiffin1 joined #gluster
20:12 jiffin joined #gluster
20:14 Chaot_s post-factum: thanx, building another test setup to check it out :)
20:25 plarsen joined #gluster
20:26 jiffin1 joined #gluster
20:51 level7 joined #gluster
21:01 haomaiwang joined #gluster
21:07 MikeLupe joined #gluster
21:37 vmallika joined #gluster
21:48 Chaot_s Are there any hints on how to find an optimal shard size?
21:51 post-factum what is your average file soze?
21:51 post-factum s/soze/size/
21:51 glusterbot What post-factum meant to say was: what is your average file size?
21:52 Chaot_s ~4-8MB
21:53 post-factum hm
21:53 post-factum then why do you want sharding :)?
21:54 Chaot_s hehehe, that's my private use / testing, and indeed no sharding needed there. For the work environment it's unknown; it's on my list of things to find out :)
21:55 post-factum then i guess you should stick to this:
21:55 post-factum shard_size = avg_file_size * 2 / distributed_bricks_count
21:56 Chaot_s It's a real mixed environment. There are researchers who have source files in the 500GB range and take those apart into sometimes tiny pieces of 1MB each; others just read those and generate a single ~1GB file.
21:56 post-factum no, i'll refine my formula
21:56 jiffin1 joined #gluster
21:57 post-factum shard_size = avg_file_size / (2 * distributed_bricks_count)
21:58 post-factum probably, you would like to separte different types of files across different volumes with different shard size
21:58 post-factum *seprate
21:58 post-factum *separate
21:58 post-factum damn
21:58 Chaot_s in a perfect world i would love to do just that.
21:58 hchiramm joined #gluster
21:59 Chaot_s That is the problem for me: there are lots of different researchers, and they work at random, so there is no knowing when, where, and what. Just make sure it's big and stable for a long time :)
22:00 post-factum just find out what the typical file size is, and then calculate the shard size for it
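
As a worked instance of post-factum's formula above (numbers purely illustrative): with an average file size of 512GB and 8 distributed bricks, shard_size = 512GB / (2 * 8) = 32GB.

    # shard_size = avg_file_size / (2 * distributed_bricks_count)
    avg_file_gb=512
    bricks=8
    echo "$(( avg_file_gb * 1024 / (2 * bricks) )) MB"   # prints "32768 MB", i.e. 32GB
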
22:01 Chaot_s I would love to have some sort of filtering that applies the shard solution only to files that are, say, >50GB
22:01 haomaiwang joined #gluster
22:02 post-factum mm no, you would want to apply sharding to 49G as well
22:04 betheynyx joined #gluster
22:04 post-factum the reason for sharding is not only to spread the file across bricks
22:04 post-factum it is also about healing granularity
22:09 Chaot_s would also help when adding more nodes to the network / group
22:11 Chaot_s Balancing out smaller shard pieces may just be better, and it balances the volume with fewer long, large file transfers at once, instead of longer connections turning into "hot zones"
22:13 post-factum precisely
22:14 Chaot_s guess that would also help with read cache...
22:16 Chaot_s It happens that lots of parallel read requests exist for the same file (and most of the time it's just a few lines out of the 500GB file); this happens a lot when they try matching up segments of files with calculations / simulations
22:16 Chaot_s does the gluster host do read cache for the bricks it hosts?
22:17 post-factum you mean, server-side cache?
22:22 Chaot_s the server that hosts the filesystem with the bricks. Caching a 500GB file is no use, or turns very expensive fast, so I never looked at it that way. Sharding those files into, say, 256MB / 512MB chunks would benefit the hot zones in those files if the server hosting the bricks has enough memory and is able to read-cache the brick's reads
22:23 post-factum Linux pagecache does the trick
22:23 Chaot_s that changes my story with sharding big time. Until now I have counted just the raw r/w for that event.
22:27 Chaot_s Still learning and exploring Gluster in more detail. The target at work would be 2x 1PB, if possible linked via a single 40G link (primary and backup location)
22:27 post-factum what is the RTT between two locations?
22:27 Chaot_s its not existent yet
22:27 post-factum what is the distance?
22:29 Chaot_s less than 10 km
22:29 Chaot_s make it 20 km
22:29 post-factum then you should be fine with synchronous replica 2
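
A synchronous replica 2 layout across two sites is a create-time choice. A minimal sketch, assuming hypothetical hosts site1 and site2; note that plain replica 2 is prone to split-brain, so an arbiter brick or replica 3 is commonly advised, and this is illustrative only:

    # one replica pair, each leg in a different location
    gluster volume create pbvol replica 2 site1:/bricks/b1 site2:/bricks/b1
    gluster volume start pbvol
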
22:30 Chaot_s the idea I have is to make part of it a replica and part manual replication / sync
22:32 post-factum manual? why?
22:35 Chaot_s Not really sure; the current idea was that not everything needs to exist on both sides, so it doesn't really need to be synced. Although I guess it would be one headache less if it just existed on both sides.
22:37 Chaot_s Just add a layer with "snapshots", be sure replication is working, and be done with multiple backups, I guess.
22:40 post-factum yup
22:41 post-factum not sure about snapshots, though
22:42 fcami joined #gluster
22:44 Chaot_s guess wrong wording on my side :). i meant something like: http://www.gluster.org/community/documentation/index.php/Features/File_Snapshot
22:45 post-factum not sure that is the thing you are looking for
22:45 dlambrig_ joined #gluster
22:45 post-factum according to http://review.gluster.org/#/c/5367/
22:45 glusterbot Title: Gerrit Code Review (at review.gluster.org)
22:45 post-factum that is for qcow2 images
22:47 Chaot_s i guess not indeed, though i guess there is some option that allows for a sort of volume snapshot. not sure what i need for that.
22:48 post-factum volume snapshots are based on underlying LVM feature, AFAIK
22:48 post-factum consider that is you really want to make use of them
22:48 post-factum *if
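
For reference, Gluster volume snapshots require the bricks to live on thinly provisioned LVM logical volumes; once that is in place the commands themselves are short. A sketch, assuming a hypothetical volume "myvol":

    # bricks must sit on thin-provisioned LVM for volume snapshots to work
    gluster snapshot create snap1 myvol
    gluster snapshot list myvol
    gluster snapshot activate snap1
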
22:51 Chaot_s I'm actually not sure how to handle backups yet; I've never had to back up this amount of data before. The current procedure is mostly manual rsyncs or images.
22:51 Chaot_s those go onto tapes that are rotated (robot)
22:54 Chaot_s so yeah, I have no idea yet how to make an automated and consistent backup of such amounts of data.
22:56 Chaot_s and yes that is a bad feeling :)
23:01 haomaiwang joined #gluster
23:05 post-factum we use separate gluster cluster to store backups
23:05 post-factum doing backups by reading directly from bricks
23:05 post-factum in that case you won't be abl to do rsync because volume is spread across different bricks
23:05 post-factum *able
23:06 post-factum on the other hand, you could script that easily, and backup only what you really need
23:10 post-factum btw, the fact that gluster bricks can be read directly is a major feature
23:11 post-factum in cephfs, for example, i have no idea how to handle backup of terabytes of data that fast
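
A hedged sketch of the backup-from-bricks approach post-factum describes, assuming a hypothetical brick path /bricks/b1 and a backup volume mounted at /mnt/backup; on a distributed volume each brick holds only part of the namespace, so something like this would run on every brick server:

    # copy one brick's contents to the backup volume, skipping gluster's
    # internal .glusterfs metadata directory
    rsync -aH --exclude='.glusterfs' /bricks/b1/ /mnt/backup/$(hostname)/b1/
    # caveat: with sharding enabled, pieces beyond the first shard live under
    # the brick's .shard directory, so per-file restores are not straightforward
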
23:11 Chaot_s Yeah, it's really still a point of debate. Some data is just really temporary and is gone in a few days, purely needed for staging large simulations, though it causes spikes in traffic and space used. The backup effort vs. its value is a consideration there.
23:13 Chaot_s There are some VMs that will be stored on there too, though not that many; I guess 30, maybe 40 at the max. Or if it's working well it will become a new feature :)
23:15 Chaot_s As said, still a lot of things to worry about; the first thing is storage, leaving the current backup routines as they are. If cluster backups are an option somehow, they will be implemented at a later stage, I guess.
23:15 post-factum for vm, i'd go with ceph
23:16 dlambrig_ joined #gluster
23:18 Chaot_s been looking in to that too, the main reason i prefer gluster for the storage end is that it seems less fault prone. i would guess that in the worst of the worst it would een still be possible to harvest the bricks for some sort of recovery work
23:18 Chaot_s *even
23:21 post-factum that is true, and that is why we export vm images from ceph rbd to gluster every week :)
23:21 post-factum as a backup
23:22 Chaot_s hehehe seems reasonable :)
23:23 Chaot_s http://rajesh-joseph.blogspot.nl/p/gluster-volume-snapshot-howto.html what is this???... seems interesting :)
23:23 glusterbot Title: A Journey: Gluster Volume Snapshot Howto (at rajesh-joseph.blogspot.nl)
23:25 Wizek joined #gluster
23:25 Chaot_s If I'm reading that right, it would be as easy as a snapshot and a backup of that snapshot, depending on how fast that snapshot can be stored away somewhere and somehow :)
23:26 post-factum yup, LVM snapshots are slow
23:28 mowntan joined #gluster
23:28 Chaot_s seems dodgy somehow :)
23:29 Chaot_s Back to what I was doing, I guess: see how Gluster could replace our current SAN-based storage.
23:31 Chaot_s Also still looking into what hardware would be suitable for the setup.
23:40 Chaot_s It's strange how so many simulations / calculations use just the simplest tools like cat, grep, and some regexps
23:43 Chaot_s It doesn't scale... and most of them don't even understand objects... and they dare to say it could be done on an old home PC back in the 80's... yeah. Scale a 1MB file to 500GB+ of 80-char lines and then do the math again :)
23:44 Chaot_s Wish I had an inflatable representation of 1MB vs 500GB
23:44 Chaot_s just to show scale :)
23:49 MikeLupe Hi - Does anyone have experience with importing to ESD with import-to-ovirt.pl ?
23:50 MikeLupe Maybe I'm fortunate here, #ovirt is ~dead
23:56 MikeLupe I've just found the solution to my failing import - just get the updated script ;)
23:57 Wizek joined #gluster
