IRC log for #gluster-dev, 2013-09-23

All times shown according to UTC.

Time Nick Message
00:28 awheeler joined #gluster-dev
03:16 kshlm joined #gluster-dev
03:24 bulde joined #gluster-dev
03:35 shubhendu joined #gluster-dev
03:44 kanagaraj joined #gluster-dev
03:55 itisravi joined #gluster-dev
04:26 ndarshan joined #gluster-dev
04:59 aravindavk joined #gluster-dev
04:59 kanagaraj joined #gluster-dev
05:17 lalatenduM joined #gluster-dev
05:19 bala joined #gluster-dev
05:19 raghu joined #gluster-dev
05:22 bulde joined #gluster-dev
05:31 hagarth joined #gluster-dev
05:38 lalatenduM joined #gluster-dev
05:52 ndarshan joined #gluster-dev
05:52 mohankumar joined #gluster-dev
05:55 hagarth joined #gluster-dev
06:00 ppai joined #gluster-dev
06:12 ndarshan joined #gluster-dev
06:14 vshankar joined #gluster-dev
06:30 hagarth joined #gluster-dev
06:34 an joined #gluster-dev
06:43 shyam joined #gluster-dev
07:36 hagarth joined #gluster-dev
09:21 hagarth joined #gluster-dev
09:40 vshankar joined #gluster-dev
10:20 shubhendu joined #gluster-dev
10:21 ndarshan joined #gluster-dev
10:22 kanagaraj joined #gluster-dev
10:23 hagarth joined #gluster-dev
10:25 aravindavk joined #gluster-dev
10:35 vshankar joined #gluster-dev
11:22 ppai joined #gluster-dev
11:30 kkeithley ndevos: avati fixed it a different way. http://review.gluster.org/5986
11:33 ndevos kkeithley: hmm... that's not following the Fedora packaging guidelines (should use _datarootdir), but oh well
11:42 ppai joined #gluster-dev
11:43 kkeithley I only took a quick glance at it late Friday night after Harshavardhana -2'd my patch set and told me about this one. If this isn't correct wrt packaging guidelines we should fix it for real.
11:48 ndevos well, how important is it to keep with "strongly encouraged" - https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Macros
12:03 kanagaraj joined #gluster-dev
12:04 shubhendu joined #gluster-dev
12:04 ndarshan joined #gluster-dev
12:05 aravindavk joined #gluster-dev
12:11 bulde1 joined #gluster-dev
12:30 vshankar_ joined #gluster-dev
12:38 bala joined #gluster-dev
12:47 itisravi joined #gluster-dev
12:51 awheeler joined #gluster-dev
12:52 awheeler joined #gluster-dev
12:56 mohankumar joined #gluster-dev
13:00 hagarth joined #gluster-dev
13:23 ndk joined #gluster-dev
13:52 vshankar joined #gluster-dev
13:52 bulde joined #gluster-dev
13:56 mohankumar avati_: around?
14:15 wushudoin joined #gluster-dev
14:57 ababu joined #gluster-dev
15:30 mohankumar joined #gluster-dev
15:32 wushudoin joined #gluster-dev
16:13 aravindavk joined #gluster-dev
17:22 an joined #gluster-dev
17:47 edward1 joined #gluster-dev
17:58 [o__o] left #gluster-dev
17:59 [o__o] joined #gluster-dev
18:10 an joined #gluster-dev
18:32 JoeJulian imo, "strongly encouraged" means unless there's a legitimate reason not to use the macro, use the macro.
19:28 johnmark JoeJulian: +1. we need to be good upstream citizens
20:45 badone joined #gluster-dev
20:50 avati_ foster: ping
20:52 foster avati_: pong
20:53 foster avati_: just replied to the last mail, let me know if we're on the same page
20:55 avati_ ok.. waiting (for email)
20:55 foster basically want to define "nothing" ;)
20:59 avati_ foster: i think there's a small disconnect
20:59 foster ok
21:00 avati_ i was hoping that we can take the current work you have done (for creating files with a backing file) with very minimal sanity enforcement (making backing image read-only by setting an xattr and enforcing read-only in the xlator based on the flag)
21:00 avati_ that should be sufficient for building iscsi on top
21:01 foster well, that's what I was thinking as well
21:01 avati_ so no dentry mucking around?
21:01 foster so that changes the interface from what we discussed
21:02 avati_ correct.. i think just this would make it usable enough for someone to build (say an iscsi target) on top of w/ snapshot capability
21:03 foster by that you mean, somebody else could build the snapshot capability?
21:04 avati_ someone on top will have to do the "close old file, open new(head) file"
21:04 avati_ instead of us doing that in xlator
21:04 foster ok
21:04 avati_ we can place markers in the xattr to aid that
21:05 avati_ (like marking the old backing image read-only, gfid of new head file to continue writing on)
21:05 foster that means somebody else has to create the new file
21:06 avati_ somebody else has to call the setxattr("qcow2:10gb:/backingfile")
21:06 foster on the new file that they have created :)
21:06 avati_ yes
21:06 avati_ correct
21:07 foster so we'll still have to deal with deleting the backing files
21:08 foster i.e., a delete rebases dependent files on depending
21:09 avati_ i think #5967 + enforcing read-only backing-file should be sufficient for what we do.. let everything else be handled by the user.. like creating files
21:09 avati_ in the future we can do that in a new xlator
21:10 avati_ let's not create or delete any file
21:10 avati_ the part we discussed in the previous call, let that be a future work in a new xlator
21:10 foster hmm, ok. I considered this, but cloning a file now blows up any users on the source file
21:12 avati_ foster: it would.. it would not be user consumable clones.. just low level pieces and primitives for someone to implement a high level feature (like iscsi with snapshot/clone support)
21:13 avati_ for user consumable clones we would have to provide API similar to btrfs and make it look and feel like btrfs
21:13 avati_ and i think, as you rightly said, that is work for a separate translator
21:14 foster well, I consider the read-only thing as policy of a separate translator as well
21:14 foster if you can delete the backing file, what difference does it make
21:15 avati_ hmm, i suppose making it read-only also means making sure you cannot delete it:-?
21:15 foster which means we need to track how many clones there are
21:15 avati_ i'm now wondering if even that is necessary, if it is targeted for a specific application (like iscsi)
21:16 foster to know when we can delete it again
21:16 avati_ (i.e enforcing read-only in the FS)
21:16 foster (i.e., policy)
21:16 foster imo, i think we should add a policy translator or not
21:17 avati_ yeah..
21:17 foster to put it another way...
21:17 foster my thought was to still expose the raw mechanisms of qemu-block, even with the policy translator
21:17 foster so a client application could do its own thing if it wanted to
21:17 avati_ ...
21:17 foster hence not wanting to put policy in qemu-block
21:18 foster (but that would be much nicer with ioctl support)
21:18 avati_ makes sense
21:18 foster so that's getting further down the road
21:18 avati_ do you propose implementing a policy xlator now?
21:18 avati_ or in the future?
21:19 foster I'd actually prefer to see the native btrfs/ioctl/reflink stuff before the policy
21:19 avati_ yes
21:19 avati_ i agree
21:19 foster so we can use that as a model and perhaps have it manage both
21:20 foster and iirc, somebody also proposed a native qcow2 translator
21:20 foster so if that came along, just the same
21:20 avati_ right
21:20 foster and then we could consider generic reflink support, possibly using qemu-block or other options, etc.
21:20 avati_ so are we on the same page then: we fix up #5967, fix crashes / alignment issues, and call it done?
21:20 avati_ rest of the stuff is future
21:21 foster i think the alignment issues go away if the use case is block
21:21 avati_ i think qcow2 does not have any implicit alignment "issues"
21:21 foster but otherwise, that sounds reasonable to me as a checkpoint
21:21 avati_ at least none which can't be handled in code
21:22 foster i don't argue that they can't be handled... just that I've hit them and haven't had time to think about it yet ;)
21:22 avati_ in the file-snapshot.t there are unaligned IO happening
21:22 avati_ rather, non-block multiple IO
21:23 avati_ it should work, and there is code in block.c to handle unaligned IO
21:23 foster it might have to do with small files or something
21:23 avati_ there is possibly one bug - we do not filter O_APPEND flag
21:24 foster i have it in my notes, but I didn't track the specific steps, I'd have to look at it again
21:24 avati_ ok
21:24 avati_ maybe we have to just handle file size completely ourselves, independent of qcow2's "formatted" size?
21:25 foster can you append to a formatted file? i thought that failed
21:25 avati_ ah, we always use anonymous fd for qcow2 IO.. ignore my O_APPEND comment
21:26 foster i thought it had fixed size allocation tables
21:26 avati_ what i meant was, setxattr(qcow2:10GB) formats the file, but the file size is still 0, as you write (like a normal file) the file size increases, like a normal file
21:27 avati_ and 10GB would be the max file size supported
21:27 avati_ i was suggesting that when i said "we handle file size completely ourselves"
21:27 foster oh
21:27 avati_ not sure if that is a good or bad idea
21:28 foster yeah, I'd have to think about it
21:28 foster mkfs would certainly complain
21:28 avati_ mkfs?
21:28 foster if I just formatted a 10GB file and wanted to put a filesystem on it
21:29 foster i.e., what a vm will probably do
21:29 avati_ you would still have to do 'truncate -s 10GB filename' first
21:29 avati_ like you would on XFS
21:29 avati_ think of the setxattr() formatting as adding hidden ability
21:30 avati_ you can truncate -s <size> on the filename as long as size is less than the formatted size
21:30 foster ok
21:30 avati_ just a different approach to make it "more normal file friendly"
21:31 avati_ this way very small files should also work smoothly, no?
21:32 foster maybe? imo, small files is a feature that has to be explored and added
21:32 foster maybe its a few minor fixes, maybe not
21:33 foster I mean, regular files (not small files)
21:34 avati_ i think today's model of fixing "file size" == "formatted size" is making this xlator "rigid" and unusable for regular files
21:35 avati_ decoupling formatted size (which can still be remembered) and file size (which starts as 0 initially, maintained separately in another xattr) would make it much more "regular"
21:35 avati_ i think the fact that setxattr() changing the file size is making this xlator "irregular"
21:36 foster makes sense, perhaps that would provide the ability to massage around any issues with I/O
21:36 foster this is also future, no?
21:37 avati_ i would like to get the file size management in as long as it does not take too long
21:37 avati_ and leave just policy for future
21:38 foster well it's going to take a little thought
21:38 avati_ i can imagine 3/4 changes which might be sufficient -
21:38 foster by policy, do we mean all renames/protection/etc.?
21:38 avati_ 1. introduce a new xattr called "file size" which is the value used in iatt_fixup() function everywhere
21:39 avati_ 2. initially "file size" is set to 0 at the time of formatting
21:39 avati_ 3. if truncate() offset is higher than "file size", increase "file size" xattr to new size
21:40 avati_ 4. if truncate() offset is smaller than "file size", decrease "file size" xattr and optionally send discard() to the decreased region
21:40 foster what if you append to a file, truncate down, then up again?
21:40 avati_ 5. if write() extends file size (offset + size > "file size") then extend "file size" xattr
21:41 avati_ of course, ensuring the file size does not grow bigger than formatted size at all the above steps
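The five rules avati_ lists above, plus the cap at the formatted size, can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class name `FormattedFile`, its attributes, and the `discarded` list are hypothetical and not part of the qemu-block xlator; the "file size" xattr is modelled as a plain attribute.

```python
class FormattedFile:
    """Toy model: logical file size tracked separately from the formatted size."""

    def __init__(self, formatted_size):
        self.formatted_size = formatted_size  # maximum size, fixed at format time
        self.file_size = 0                    # rule 2: starts at 0 when formatted
        self.discarded = []                   # (offset, length) regions sent to discard()

    def _check(self, new_size):
        # The tracked size must never exceed the formatted size.
        if new_size > self.formatted_size:
            raise ValueError("size exceeds formatted size")

    def truncate(self, offset):
        self._check(offset)
        if offset < self.file_size:
            # rule 4: shrinking, so discard the region that was cut off
            self.discarded.append((offset, self.file_size - offset))
        # rules 3 and 4: the tracked size becomes the truncate offset
        self.file_size = offset

    def write(self, offset, length):
        end = offset + length
        self._check(end)
        if end > self.file_size:
            # rule 5: an extending write grows the tracked size
            self.file_size = end

    def stat_size(self):
        # rule 1: the value a stat fixup would report to the caller
        return self.file_size
```

A non-extending write leaves the tracked size alone, which is why `write()` only updates it when `offset + length` passes the current value.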
21:41 avati_ appending to file = write to file (except don't trust the offset on which write is arriving, seek to latest EOF)
21:41 avati_ don't think we need to truncate anything during append?
21:43 foster i mean, we'd have to zero out truncated regions
21:43 avati_ should we?
21:43 foster otherwise you expose stale data, no?
21:43 avati_ not unless we limit read() output to the file size
21:44 foster until you truncate extend the file
21:44 avati_ qcow2 takes care of that already..
21:44 avati_ no?
21:44 foster but we aren't using qcow2 here
21:44 avati_ ??
21:44 foster aren't we just changing our definition of the file size?
21:45 avati_ we are, but we are still working on top of qcow2 right?
21:45 avati_ should we get on a call? :)
21:45 foster what do you mean by "don't trust the offset on which the write is arriving?"
21:45 avati_ ah
21:45 avati_ so write() arrives from fuse with a particular offset
21:45 foster sure
21:45 avati_ if the fd was opened with O_APPEND, we should not trust that offset
21:46 avati_ just assume the offset = current "file size"
21:46 avati_ and update "file size" to "file size" + write size
21:46 foster oh, why is that necessary?
21:46 avati_ because we need to ignore lseek() on O_APPEND fd
21:47 avati_ but fuse would just present the new offset to us
21:47 avati_ 99% of the time, the offset presented by fuse will match "file size" anyways
21:47 foster shouldn't the offset be the size of the file? why wouldn't that be an error
21:48 avati_ if fd was opened with O_APPEND, offset should *always* be the latest size of the file.. we just need to enforce that
21:48 avati_ so far posix has been enforcing that for us already
21:48 avati_ but now with qcow2 it becomes our responsibility
21:49 foster makes sense, the way I read your statement though was that if somebody seeked to the middle of the file and wrote, we'd append that data
21:49 foster when we should return an error
21:50 avati_ i don't think so.. we should silently append.. that's what xfs/ext4 does too
21:50 foster oh, really? I'll have to try that then I guess
21:50 foster if that's the posix defined behavior then that makes sense
21:50 avati_ O_APPEND means "ignore user specified offset" and use the latest file size as offset
21:51 foster my question before was 1.) append to the file up to 10GB 2.) truncate to 0 3.) truncate to 10GB 4.) read the file, expect zeroes
21:51 avati_ the user might lseek() around to read different parts of the file, but writes should strictly go to the end
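The O_APPEND behaviour described above can be checked on an ordinary local file, and the rule the xlator would have to enforce can be written as a one-line helper. A minimal sketch, assuming a POSIX system; `append_ignores_seek` and `effective_write_offset` are hypothetical names used only for illustration.

```python
import os
import tempfile


def append_ignores_seek():
    """Write, seek back to 0, write again on an O_APPEND fd; return the file contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        fd = os.open(path, os.O_RDWR | os.O_APPEND)
        os.write(fd, b"hello")
        os.lseek(fd, 0, os.SEEK_SET)  # move the file offset back to the start
        os.write(fd, b"world")        # O_APPEND: the kernel still writes at EOF
        os.close(fd)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


def effective_write_offset(requested_offset, file_size, opened_with_append):
    """Offset a write should use: the tracked file size if the fd is O_APPEND."""
    return file_size if opened_with_append else requested_offset
```

The second write does not overwrite the first, so the file ends up holding both chunks in order. `effective_write_offset` is the same rule applied to an offset handed in by fuse.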
21:51 foster gotcha, I'll read up on that
21:52 avati_ foster: i think for your sequence of steps qcow2 should already handle it if we send a discard() to the truncated region
21:52 avati_ qemu has discard() fop
21:52 avati_ we should just call it i think
21:52 foster ok
21:53 avati_ assuming discard() does the right thing, handling file size by ourselves does not look overly complex, right?
21:54 avati_ if discard() is not guaranteeing zero filled reads in the future, then let's ignore the whole "manage file size completely ourselves" exercise
21:54 foster it looks like qcow2 synchronously deallocates those blocks
21:54 foster (on first glance)
21:54 avati_ yes, that's what i remembered too
21:55 avati_ so i was imagining discard()ing truncated regions will work for us
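foster's sequence (append, truncate to 0, truncate back up, read and expect zeroes) can be walked through with a toy sparse image in which discard() deallocates blocks and unallocated blocks read as zero. This is a sketch of the assumption being discussed, not of qcow2 itself: `SparseImage`, its 4-byte block size, and its methods are all hypothetical.

```python
class SparseImage:
    """Toy block store: unallocated blocks read back as zeroes."""

    BLOCK = 4  # tiny block size to keep the example readable

    def __init__(self):
        self.blocks = {}  # block index -> bytearray(BLOCK)
        self.size = 0     # separately tracked logical file size

    def write(self, offset, data):
        for i, byte in enumerate(data):
            idx, off = divmod(offset + i, self.BLOCK)
            self.blocks.setdefault(idx, bytearray(self.BLOCK))[off] = byte
        self.size = max(self.size, offset + len(data))

    def discard(self, offset, length):
        # Zero the discarded bytes, then deallocate blocks that are all zero.
        for pos in range(offset, offset + length):
            idx, off = divmod(pos, self.BLOCK)
            if idx in self.blocks:
                self.blocks[idx][off] = 0
        for idx in [i for i, b in self.blocks.items() if not any(b)]:
            del self.blocks[idx]

    def truncate(self, new_size):
        if new_size < self.size:
            self.discard(new_size, self.size - new_size)
        self.size = new_size

    def read(self, offset, length):
        # Reads are limited to the tracked file size.
        length = max(0, min(length, self.size - offset))
        out = bytearray()
        for pos in range(offset, offset + length):
            idx, off = divmod(pos, self.BLOCK)
            block = self.blocks.get(idx)
            out.append(block[off] if block else 0)
        return bytes(out)
```

In this model, truncating down discards the cut-off region, so a later truncate up exposes only zeroes. Whether a real discard guarantees zero-filled reads is exactly the open question in the conversation.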
21:56 foster so the user creates a small file, we clone it and format the "exception space" to what?
21:56 avati_ the user creates a file, optionally formats() it (with the maximum expected file size) and continues treating it like a regular file
21:57 foster ok, the user decides
21:57 avati_ we can optionally set a big value, like 1TB by default?
21:58 avati_ when formatting with a backing file, we inherit the backing file's formatted size ourselves
21:58 foster eh, maybe allow the user to configure a default size if none is provided
21:58 foster (format size)
21:58 avati_ yes, we can do that
21:59 avati_ we can also optionally allow pre-formatting *all* files, which would be a bad idea if the user stores lots of small files, but probably OK otherwise
21:59 avati_ pre-formatting all files is clearly a future feature, not for now
22:01 foster is consistency an issue, between the file size xattr and the real size?
22:01 foster or i guess, against extending writes
22:01 avati_ as long as xattr is updated after performing the write we should be OK
22:02 avati_ (that way on a crash we won't have a larger file size without data written)
22:02 foster i don't think that's a guarantee :/
22:03 avati_ we can keep the extended file size in memory up to date in real time, and issue flush/fsync on backing file before performing setxattr?
22:03 avati_ in a delayed manner?
22:04 foster the fsync might be necessary
22:05 foster not sure how much delay is sufficient and doesn't kill performance
22:05 avati_ let's get the correctness right, and target this for large files / VMs where extending writes are minimum
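The ordering discussed above (make the data durable first, then publish the larger size) can be sketched on a local file. Assumptions: a POSIX system with `os.pwrite`; the `persist_size` callback stands in for the setxattr of the "file size" xattr; the function names are hypothetical.

```python
import os
import tempfile


def extending_write(fd, offset, data, current_size, persist_size):
    """Write data; if the file grew, fsync before recording the new size."""
    os.pwrite(fd, data, offset)
    new_size = max(current_size, offset + len(data))
    if new_size > current_size:
        os.fsync(fd)            # data reaches stable storage first
        persist_size(new_size)  # only then record the larger size
    return new_size


def demo():
    """Run two writes and return (final size, list of persisted sizes)."""
    recorded = []
    fd, path = tempfile.mkstemp()
    try:
        size = extending_write(fd, 0, b"abcd", 0, recorded.append)
        size = extending_write(fd, 1, b"xy", size, recorded.append)  # not extending
        return size, recorded
    finally:
        os.close(fd)
        os.unlink(path)
```

With this order, a crash between the fsync and the size update leaves the recorded size smaller than the data on disk, never larger. The fsync on every extending write is the performance cost foster raises.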
22:06 avati_ so
22:06 foster ok
22:06 avati_ do you think "managing file size completely ourselves" is something which can be done over the next few days? or should we shelve the idea for now?
22:07 foster i can try it, seems reasonable enough.
22:07 avati_ note: we would need per internal-snapshot file size value (per internal-snapshot xattr)
22:08 avati_ stored separately, and reset the current file size xattr on every "goto" command..
22:08 foster we can see if anything crops up
22:08 foster shouldn't a snapshot be the same size as the origin?
22:08 avati_ similarly capture current size into a new xattr on "create" command
22:08 avati_ i don't think so.. why should it?
22:09 foster so you create vm image, snap it
22:09 foster wouldn't you want to read-only mount it for a backup or something?
22:09 avati_ with internal snaps that has never been possible anyways
22:10 foster eh, right. so I want to go back in time because my vm exploded
22:11 avati_ which is why you need to preserve the file size at the time of snapping into the per-snap xattr, so that you can restore the size when you "goto" that snapshot
22:11 avati_ 16:05 < avati_> similarly capture current size into a new xattr on "create" command
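The per-snapshot bookkeeping avati_ describes (capture the size on "create", restore it on "goto") can be modelled with a dictionary standing in for the per-snapshot xattrs. A minimal sketch; `SnapshotSizes` and its attribute names are hypothetical.

```python
class SnapshotSizes:
    """Toy model of a current file size plus one saved size per internal snapshot."""

    def __init__(self):
        self.current_size = 0  # stands in for the current "file size" xattr
        self.snap_sizes = {}   # snapshot name -> size at snapshot time

    def create(self, name):
        # "create": capture the current size into the per-snapshot slot
        self.snap_sizes[name] = self.current_size

    def goto(self, name):
        # "goto": reset the current size to what it was at snapshot time
        self.current_size = self.snap_sizes[name]
```

After a "goto", the file reports the size it had when the snapshot was taken, even if it grew or shrank afterwards.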
22:13 foster ok, that's what I meant by matching the origin
22:13 avati_ ah, i see..
22:13 foster i mistook what you meant by reset the size
22:13 foster makes sense
22:14 avati_ so here are two possible outcomes -
22:14 avati_ 1. fix up #5967 and crashes, call it done
22:14 avati_ 2. fix up #5967 and crashes, handle file size ourselves, call it done
22:15 avati_ depending on how you see fit, we pick either one?
22:15 foster i don't like calling it "done" ;) , but yeah, the file size thing can come as a separate patch
22:15 foster I guess I can throw away the half-baked stuff I have right now
22:16 avati_ by calling it done == point where we can pick up next project
22:16 foster yeah
22:17 avati_ cool, now i think we are on the same page :p
22:17 foster yep, sounds good
22:18 avati_ i'm assuming your "half-baked stuff" is the same thing which would become the policy xlator eventually?
22:19 foster yeah, for the most part