Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2016-06-24


All times shown according to UTC.

Time Nick Message
01:29 cognominal joined #moarvm
01:32 timotimo hm, we currently don't measure MVMString allocations at all in the profiler. i wonder how we could do that properly
01:48 ilbot3 joined #moarvm
01:48 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
02:13 lizmat joined #moarvm
05:29 lizmat joined #moarvm
06:46 domidumont joined #moarvm
06:51 domidumont joined #moarvm
07:54 lizmat joined #moarvm
09:48 jnthn timotimo: Just stick in allocation logging instructions after all the string ops would be a start.
12:08 brrt joined #moarvm
12:55 brrt ehm, how worthwhile do any of you think it would be for me to refactor spesh allocation out of spesh into its own thing
12:57 jnthn Well, are you going to use it for something else? :)
12:59 brrt the jit more or less wants its own thing, because it has much shorter lifetime than the spesh graph
12:59 brrt to be precise, 'structures generated by the compiler tend to live much shorter than the spesh graph'
13:00 brrt arguably not the biggest worry
13:01 timotimo what i've seen for "extremely short lifetime" is having a circular buffer that gets cleared per-frame in OpenGL-like stuff; in particular for the nintendo 3ds
13:02 brrt doesn't work, because i can't predict how much memory i'll need
13:03 brrt circular buffers are / can be cute, though
13:03 timotimo well, if your current memory needs exceed a single buffer, build a second one
13:03 timotimo if your lifetime is *really* short, you can use the stack :)
13:21 jnthn brrt: Hm, don't we typically discard the spesh graph right after the JIT?
13:21 brrt hmmm
13:21 brrt true
13:21 jnthn brrt: Or are you saying you only need the memory for one phase of the JIT?
13:21 brrt ehm, well, ehm,
13:21 brrt let me think about the correct answer
13:21 timotimo we're still keeping memory around a bit for logging and such
13:21 jnthn If yes, you do get a win in terms of lower memory overhead but you have to be darn careful you don't get pointers in the wrong direction between the lifetime'd regions.
13:21 jnthn timotimo: Yes, but we JIT after that :)
13:22 timotimo whereas we enter jit, use the memory, leave jit and kick it out immediately, i'd say
13:22 jnthn That is, at the logging phase we're still interpreting.
13:22 brrt the thing i'm thinking about is the tile list and the value descriptors
13:22 jnthn Then we spesh based on the logged stuff, then we JIT.
13:22 brrt those are per-basic block
13:22 timotimo right
13:22 brrt at best
13:23 brrt you'll never have a pointer to any of these from the spesh graph
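
A minimal sketch of the kind of per-compilation region allocator being weighed up here, assuming hypothetical type and function names: everything the JIT allocates from it lives only for one compilation and is freed in a single call, which only stays safe as long as nothing in the longer-lived spesh graph points into it.

    #include <stdlib.h>

    /* Hypothetical per-compilation arena: everything allocated from it lives
     * only as long as one JIT compilation and is freed in a single call, so
     * nothing reachable from the spesh graph may point into it. */
    typedef struct RegionBlock {
        struct RegionBlock *prev;
        size_t              used;
        size_t              size;
        char                data[];
    } RegionBlock;

    typedef struct { RegionBlock *top; } Region;

    static void * region_alloc(Region *r, size_t bytes) {
        /* grow with a fresh block when the current one is exhausted */
        if (!r->top || r->top->used + bytes > r->top->size) {
            size_t size    = bytes > 4096 ? bytes : 4096;
            RegionBlock *b = malloc(sizeof(RegionBlock) + size);
            b->prev = r->top;
            b->used = 0;
            b->size = size;
            r->top  = b;
        }
        void *p = r->top->data + r->top->used;
        r->top->used += bytes;
        return p;
    }

    static void region_destroy(Region *r) {
        /* one call frees every tile list and value descriptor at once */
        while (r->top) {
            RegionBlock *prev = r->top->prev;
            free(r->top);
            r->top = prev;
        }
    }
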
13:29 zakharyas joined #moarvm
13:30 brrt it's probably not important enough; my second consideration was 'not having to walk through three object layers in order to get at the spesh pool'
13:30 timotimo we could just pass the spesh pool along on the stack if we're so worried about performance
13:31 timotimo or hope that the c compiler and/or cpu caches will make it work fine for us
13:33 brrt i'm not
13:33 brrt i'm worried about convenience
13:33 brrt performance, what, me worry?
13:34 timotimo :D
13:34 timotimo well, we can still have macros
13:35 brrt ok,ok, i'll do something useful instead
13:35 brrt ... whenever i actually have time
13:37 timotimo i was wondering: with the short-string-cache, we could compile things like substr when we know the argument for length is 1 to immediately go through the cache in the jitted code and only hit the C function if we know the cache isn't hit
13:37 timotimo but that'd also mean we'll hit the cache twice. though the cache will already be in the cache :P
13:37 timotimo hm, except, substr has to go through the grapheme iterator
13:39 timotimo it could work for chr, though
13:39 brrt better answer
13:39 brrt we can transform, at spesh or preferably JIT time, the substr, etc. ops into 'low level string operations code'
13:40 brrt these we can JIT fast
13:40 timotimo hm, you're suggesting we have some basic operations like "gimme a grapheme iterator", "destroy the grapheme iterator", "advance the iterator", ...
13:41 brrt kind of, yes
13:41 timotimo that's not bad
13:41 brrt it's a lot of work
13:41 timotimo that works much more simply with the expr jit than with current spesh
13:42 brrt right
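
A toy, self-contained model of the chr / short-string-cache fast path mentioned above; the type and function names are illustrative, not MoarVM's real string API. The point is that JITted code could resolve a cache hit inline and only call the slow C path on a miss.

    #include <stdlib.h>

    /* Toy model of the chr fast path: one-grapheme strings for small
     * codepoints are interned in a flat table, so JITted code can resolve a
     * hit inline and only take the slow path on a miss. Str and the function
     * names are illustrative, not MoarVM's real string API. */
    typedef struct { int grapheme; } Str;

    #define CHR_CACHE_SIZE 128
    static Str *chr_cache[CHR_CACHE_SIZE];

    static Str * make_one_grapheme_string(int g) {   /* slow path */
        Str *s = malloc(sizeof(Str));
        s->grapheme = g;
        return s;
    }

    static Str * cached_chr(int g) {
        if (g >= 0 && g < CHR_CACHE_SIZE) {          /* inlineable fast path */
            if (!chr_cache[g])
                chr_cache[g] = make_one_grapheme_string(g);
            return chr_cache[g];
        }
        return make_one_grapheme_string(g);          /* cache miss */
    }
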
13:42 brrt (i may want to add a LOOP primitive, just to make it more like LISP)
13:42 timotimo :D
13:43 brrt actually, there are other, better reasons for it, but it is quite odd
13:43 timotimo fair enough, yeah
13:43 brrt the better reasons are that we'd like to have explicit which variables are updated in the loop, so that we can take them into account
13:44 timotimo oh?
13:44 timotimo is that for certain optimization techniques?
13:44 timotimo or just better compilation or something?
13:44 brrt in this case, to keep the tree structure meaningful
13:44 brrt e.g. suppose we have a loop that updates two variables
13:45 brrt in lisp, that would be something like
13:46 brrt (loop ((x 1 (+ x 1)) (y 10 (- y 1)) (z 0 (+ z (* x y)))) (> x y))
13:46 brrt something like that.. actually, there is supposed to be a body in there
13:47 brrt point is, the loop terminates when x is greater than y
13:47 brrt which node, in this case, defines x on output
13:48 brrt actually, i'm not sure that makes a lot of sense
13:48 brrt i'd think a LOOP would be void-valued in most cases
13:49 brrt better point: expression language doesn't support direct assignment to either x or y
13:49 timotimo i'm not sure i follow
13:50 brrt no, i'm not sure i do either
13:50 brrt :-)
13:50 timotimo you don't mean how the ((x 1 ...)) part declares a starting value for x?
13:50 brrt my point is basically this: the expression 'tree' forms a DAG, right?
13:50 timotimo right, you had that point in the past
13:51 timotimo and how things have to get replicated so that you can, for example, refer to the same value twice
13:51 brrt uhuh
13:51 brrt well, suppose i have a loop that calls do_foo (x,y) repeatedly
13:52 brrt i'll need to have something that refers to the notion of x and y inside the loop
13:52 timotimo right
13:52 brrt because they are not just their original values anymore
13:52 brrt the context of this was 'it would be cool if the expression jit could deal with multiple-basic block regions, preferably hot loops'
13:53 brrt low level string ops would fall into that category
13:53 timotimo mhm
13:53 brrt the answer in this case was 'we should have a looping structure to keep track of the changes'
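
One way to picture such a looping structure, as a hypothetical expression-tree node (names are illustrative, not the real expr JIT layout): each loop-carried variable is declared with an initial value and an update expression, so the DAG never needs a direct assignment to x or y.

    /* Hypothetical expression-tree node for an explicit LOOP; names are
     * illustrative, not the real expr JIT layout. Each loop-carried variable
     * is declared with an initial value and an update expression, so the DAG
     * never contains a direct assignment to it. */
    typedef struct ExprNode ExprNode;

    typedef struct {
        ExprNode *init;     /* value on loop entry          */
        ExprNode *update;   /* value for the next iteration */
    } LoopVar;

    typedef struct {
        LoopVar  *vars;     /* e.g. x, y, z from the example above     */
        int       num_vars;
        ExprNode *test;     /* loop terminates when this becomes true  */
        ExprNode *body;     /* usually void-valued, per the discussion */
    } LoopNode;
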
14:01 timotimo afl has certainly found a crapton of crashes, claiming a whole bunch of them are unique
14:02 timotimo but i suppose many still fall into the same category either way
14:03 timotimo m: say ^593 .pick(3)
14:03 camelia rakudo-moar 9b579d: OUTPUT«(273 436 151)␤»
14:04 timotimo those are the crashes from the S1 crashes folder :)
14:08 timotimo a bunch of invalid writes of size 1 inside MVM_bytecode_finish_frame
14:09 timotimo also, a calloc of size 0
14:13 timotimo this next one isn't as interesting, i bet. invalid write of size 8 coming from arg_s, so it's probably just a far-outta-range argument to arg_s
14:13 jnthn Hm, though the validator should really catch those
14:14 lizmat joined #moarvm
14:15 timotimo yeah, i don't think it has code for that yet, though
14:15 timotimo and just looking at it from afar tells me it's annoying to write code in it
14:17 jnthn urgh, hottest day this year
14:18 timotimo hopefully "ever", not just "so far" :)
14:18 jnthn Well, tomorrow - if the forecasts hold up - it'll break.
14:18 jnthn (Thunderstorms forecast)
14:18 jnthn Wouldn't surprise me if it hits this temperature again later on in the year.
14:19 timotimo probably :|
14:19 * jnthn wonders if it's time to get the fan
14:19 timotimo just yesterday i got a link to a good AC unit and i was surprised how relatively speaking it was kind of cheap
14:20 timotimo but i guess it's also expensive in the "eats all your electrons" way
14:22 timotimo or rather: brakes the wobbling of all the electrons between you and the nearest power plant
14:22 jnthn They all need some way to stick a pipe to outside, though, I think?
14:23 timotimo yeah
14:23 timotimo the same page that it was on also offers a ... well, it looks like a huge sock you put over the window and the hose goes through that
14:27 timotimo i have no idea how it's supposed to work :)
14:27 timotimo glue it onto the window frame, perhaps
14:28 jnthn The fan is deployed :)
14:37 jnthn Hmm...interesting memory corruption is interesting...
14:42 jnthn Smells like GC
14:42 timotimo "EU now has 1 GB of free space"
14:47 brrt joined #moarvm
14:47 brrt i should have studied so much more math when i had the chance...
14:50 jnthn Duh, found it
14:52 dalek MoarVM/new-multi-cache: 621fe27 | jnthn++ | src/6model/reprs/MVMMultiCache.c:
14:52 dalek MoarVM/new-multi-cache: Add missing MVM_ASSIGN_REF.
14:52 dalek MoarVM/new-multi-cache:
14:52 dalek MoarVM/new-multi-cache: Fixes missing write barriers when assigning into a multi-cache, which
14:52 dalek MoarVM/new-multi-cache: caused various crashes.
14:52 dalek MoarVM/new-multi-cache: review: https://github.com/MoarVM/MoarVM/commit/621fe276a5
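
For context, MVM_ASSIGN_REF is MoarVM's write-barriered pointer store; the sort of change involved looks roughly like this (the cache field names here are illustrative):

    /* Without the barrier, a cache object already in gen2 can end up pointing
     * at a nursery object the GC never hears about, which is the kind of
     * corruption seen above (field names here are illustrative): */
    node->result = result;                                  /* missing barrier */

    /* With the barrier, the cross-generation reference is recorded: */
    MVM_ASSIGN_REF(tc, &(cache_root->header), node->result, result);
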
14:53 brrt jnthn++
14:53 jnthn Hopefully spectest comes out a bit better now :)
14:57 jnthn Yup :)
14:59 nwc10 ASAN's verdict might still be "a dead mouse on your carpet"
14:59 nwc10 (it's running)
15:01 jnthn Maybe, though if it is then the issue is well hidden
15:01 jnthn The spectest run had both FSA and GC debugging on
15:01 jnthn OK, so...we have a new multi-dispatch cache :)
15:02 jnthn Which knows about nameds, in theory :)
15:02 timotimo does it already make that benchmark liz recently showed a bunch of times faster?
15:02 jnthn No
15:02 jnthn Because Rakudo's multi-dispatch code doesn't yet take advantage of it.
15:03 jnthn It assumes it can't install things with named args into the cache.
15:03 lizmat jnthn: where would that need to be fixed? in src/Perl6 ?
15:03 timotimo ah, ok
15:03 timotimo might be in the BOOTSTRAP or near that
15:03 jnthn lizmat: Yeah, src/Perl6/Metamodel/BOOTSTRAP.nqp or so
15:03 jnthn lizmat: I need to look at the code there carefully 'cus it's been a little while
15:04 lizmat stuff like bind_ons_param and so ?
15:04 lizmat *one
15:04 jnthn No, that's sig binding
15:04 jnthn Closer to find_best_dispatchee
15:04 lizmat ah, ok
15:04 jnthn lizmat: I *think* it'll need to tease apart "needs a bind check just to validate nameds" from "needs a bind check because of unpacks or constraints"
15:05 jnthn Which I believe are conflated at the moment
15:05 timotimo oh, could find_best_dispatchee run super crazy often in many of my tests just because nameds are involved?
15:05 lizmat timotimo: yes
15:05 timotimo i should have known :)
15:05 jnthn Quite possibly. I mean, @a[$foo]:exists is the classic example
15:05 jnthn And what triggered me to do something about it.
15:05 timotimo i have the feeling nobody bothered to tell me about that :P
15:06 jnthn Well, it's about to change, so... :P
15:06 timotimo yay :)
15:06 jnthn The new cache is kinda interesting.
15:06 jnthn It's structured as a tree rather than an array of array of type tuples
15:07 jnthn So it should be a bit lighter on memory, and a bit faster to search in
15:07 timotimo very cool
15:07 jnthn It hashes on the memory address of the interned callsite
15:08 jnthn To find the tree top
15:08 jnthn It also doesn't have a size limit.
15:09 timotimo ooooh
15:09 jnthn Which is a trade-off :)
15:09 timotimo right, we still don't have a clue how to clear those out
15:09 jnthn Well, one idea is just to say "if it gets huge, throw it away"
15:09 timotimo right
15:10 timotimo only the ones that show up often will get back in, then
15:10 jnthn And it'll be reconstructed with stuff the program is currently interested in.
15:10 jnthn It suffers a bit on megamorphs.
15:10 timotimo yeah
15:10 jnthn Though no worse than what it replaces.
15:10 timotimo surely
15:10 jnthn "in theory"
15:10 jnthn :)
15:10 jnthn In practice, not sure :)
15:10 timotimo we don't have diagnostics in place yet
15:11 jnthn Well, I added two
15:11 timotimo Oh, ok!
15:11 jnthn Though they're #define'd things
15:11 timotimo that's fine in my opinion
15:11 jnthn One is for dumping the cache on each add to see what on earth is in there.
15:11 jnthn Well, more like, to see the tree structure :)
15:11 timotimo oh, that's probably a bit noisy
15:11 jnthn The other just points out whenever the cache size hits a power of 2 size
15:12 jnthn Starting at 32. So 32 entries, 64 entries, 128 entries, etc.
15:12 timotimo so, how does it look on the global scale? we have a single tree that holds everything?
15:13 jnthn No, it's still one cache per proto
15:13 jnthn So they are GC-able in that sense.
15:14 timotimo ah, good
15:14 jnthn The data structure is documented in MVMMultiCache.h :)
15:18 timotimo ah, this is an interesting design, like a tree of ops
15:19 timotimo and the way it gets safepoint-freed
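
A simplified, self-contained sketch of the shape described above (not the real MVMMultiCache layout): the cache lives per proto, hashes on the interned callsite's address to pick a tree root, and the tree nodes, stored flat in an array, each test one argument's type.

    /* Toy model of a tree-shaped multi cache: nodes live in a flat array
     * ("trees-in-arrays"), each node tests one argument's type, and a full
     * match yields the cached candidate. Layout and names are illustrative,
     * not MoarVM's MVMMultiCache. */
    typedef struct {
        void     *type;       /* type the current argument is compared to    */
        unsigned  match;      /* node index for the next argument (0 = none) */
        unsigned  no_match;   /* node index of a sibling type to try instead */
        void     *result;     /* candidate to return once all args matched   */
    } CacheNode;

    typedef struct {
        void     *callsite;   /* interned callsite, hashed by address        */
        unsigned  root;       /* first node of this callsite's tree          */
    } CacheRoot;

    static void * cache_lookup(const CacheNode *nodes, unsigned root,
                               void **arg_types, unsigned num_args) {
        unsigned node = root, i = 0;
        if (num_args == 0)
            return NULL;                         /* toy model: no zero-arg entries */
        while (node) {
            if (nodes[node].type == arg_types[i]) {
                if (++i == num_args)
                    return nodes[node].result;   /* matched every argument */
                node = nodes[node].match;        /* descend to the next one */
            }
            else {
                node = nodes[node].no_match;     /* try another type here   */
            }
        }
        return NULL;                             /* cache miss */
    }
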
15:27 nwc10 jnthn: ASAN tolerates your code. No mouse for you! :-)
15:27 jnthn Phew!
15:28 jnthn Don't want to be full for the karahi chicken I'll hopefully find the energy to make in this heat :)
15:28 nwc10 anyway, "lack of mouse" is pretty cool
15:28 nwc10 (er, sorry, not literally)
15:28 nwc10 jnthn++
15:28 nwc10 I think you've nailed it. At least, at the MoarVM level
15:29 jnthn :)
15:30 jnthn Seems we get a *very* tiny improvement on the %h<a>:exists case simply out of the new cache saying "no" faster
15:30 jnthn And so faster failing over
15:33 jnthn Hm, turns out the simplest possible Rakudo patch isn't quite enough.
15:35 jnthn I want to rest and make dinner, but https://gist.github.com/jnthn/b1b1a569c930442a22f3ad2b17a29edc is what I tried if anyone fancies figuring out why that doesn't work out
15:35 jnthn (The reason could be nearly anywhere)
15:35 jnthn afk for now
16:18 domidumont joined #moarvm
16:41 lizmat joined #moarvm
17:31 brrt joined #moarvm
17:31 brrt yes, the tree structure is cool
17:32 brrt hurray for trees-in-arrays
17:37 harrow joined #moarvm
17:41 brrt joined #moarvm
17:47 brrt stupid hot weather though
17:55 timotimo yeah. with a bit of moisture and absolutely still-standing air .. it's not nice
18:42 brrt joined #moarvm
19:09 FROGGS joined #moarvm
19:09 lizmat joined #moarvm
19:23 zakharyas joined #moarvm
20:11 brrt damnit, damnit to hell
20:11 brrt grrrr
20:11 brrt ok, shall i tell you a story that will amuse you
20:12 brrt or not
20:12 brrt during register allocation, i need to insert loads and stores to ensure that values are in their correct place
20:13 brrt in order to load a value, i might need to spill a value
20:13 brrt whenever i spill a value, i spill it right after it is constructed
20:14 brrt i do that by noting the tile number that created it and inserting a 'spill' tile just after that
20:15 brrt however, it is just about possible that the load that overwrites it has to happen just before the next tile
20:15 brrt thus, we have tile i, {spill, load}, tile i + 1
20:16 brrt since spill and load are not relatively ordered, it is just possible that i first load and then spill
20:17 brrt meaning i overwrite the value
20:18 brrt breakage follows
20:19 FROGGS uhh
20:19 jnthn ouch
20:19 timotimo ah, whoops
20:20 brrt this is especially annoying, because i just simplified the tile editor by not requiring such relative orders
20:21 brrt so i'm puzzling how to solve this cheaply and yet robustly
20:21 brrt the power-solution would be to allow tile inserts to specify insert-after or insert-before relations to specific tiles, not to order numbers
20:21 brrt but that makes the insertion code really complex
20:22 brrt because it requires topological sort (i think)
20:22 brrt well, not really, really complex, but more complex than i want to have to debug
20:23 jnthn Of course, you can solve every compiler problem with adding another phase. Like, "spit out virtual registers you can have as many as you like of"..."allocate them" :P
20:23 timotimo everything ought to be topologically sorted anyway %)
20:23 brrt the not-so-robust solution, but one which might be workable, is to give the register allocator an insert-counter
20:23 brrt that is true, except for too slow compilation
20:23 brrt case in point: scala
20:23 jnthn Right :)
20:24 jnthn Yeah, every problem except too many passes.
20:25 brrt and the theory would then be, because i'm deciding to insert the load based on the fact that its value has already been spilled, the spill must have been inserted before the load
20:26 brrt ('its' means: the value in the register)
20:26 brrt thus, sorting by the insert-sequence number secondary to sorting to the tile order number would give us the necessary relative ordering
20:26 brrt but it feels rather breakable to me
20:26 jnthn Yeah, it has a slightly fragile feel to me also
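
For concreteness, the insert-counter idea might look like this (hypothetical names, not the actual tile list editor): each pending insert records both the tile order number it follows and a per-allocation sequence number, and the final sort uses the sequence number as a secondary key, so a spill recorded before the load that forced it also lands in front of it.

    #include <stdlib.h>

    /* Hypothetical pending-insert record for the tile list editor: 'after' is
     * the order number of the tile the insert follows, 'seq' is bumped on
     * every insertion the register allocator makes. */
    typedef struct {
        unsigned  after;   /* tile order number the insert follows      */
        unsigned  seq;     /* insertion sequence number (secondary key) */
        void     *tile;    /* the spill or load tile being inserted     */
    } PendingInsert;

    static int insert_cmp(const void *pa, const void *pb) {
        const PendingInsert *a = pa, *b = pb;
        if (a->after != b->after)
            return a->after < b->after ? -1 : 1;
        /* same gap: the spill, inserted earlier, sorts before the load that
         * would otherwise overwrite the register */
        return a->seq < b->seq ? -1 : (a->seq > b->seq ? 1 : 0);
    }

    /* qsort(inserts, num_inserts, sizeof(PendingInsert), insert_cmp); */
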
20:27 brrt it's made more complicated because a pre-coloring pass (which is in the works) assigns registers in backwards order
20:27 timotimo i'm minimizing a bunch of crashing test cases now, so that i can download it (and give it to y'all)
20:28 brrt i.e. you detect which register to assign a value based on the later consumers of that value
20:28 brrt that will need some work with the register assignment logic, fwiw...
20:28 timotimo yeah, it's a bit more pull than push, isn't it?
20:28 brrt yes
20:29 brrt so that complicates matters, but i'm not sure it breaks them
20:30 brrt the insert-before relationship also requires that i track the tile that created a value descriptor
20:31 brrt otherwise, i can't easily find the tile that created my descriptor that i'm kicking out of the register
20:34 brrt i have half a mind to rename MVMJitValueDescriptor to MVMJitValue
20:35 jnthn Well, if you talk about them a lot then it's a load less verbose...
20:35 * jnthn just went to take trash, and on returning to his apartment noticed it smells a little like walking into an Indian restaurant :)
20:36 jnthn The air outside has been notably warmer/more humid than I have inside today, so been reluctant to open windows. :S
20:38 jnthn brrt: I'm wondering a bit if things get simpler if the tiles were projected down to a linear bunch of instructions before doing the allocating? Or is that just restating the "add another pass" approach?
20:38 brrt tiles already are a linear list of instructions :-)
20:39 brrt so the pass is already there
20:39 jnthn oh :)
20:40 brrt well, they didn't used to be, they used to be tagged to the tree, but that gave lots of conceptual difficulties
20:40 jnthn So in that sense the tiles "don't exist" by this point in some sense, we just have a linear bunch of instructions and a CFG?
20:41 brrt correct
20:41 brrt i fear i have a terminology screwup again though
20:42 brrt when i say 'tile', i mean the thing that stands in for the generated bytecode
20:43 brrt i don't have a CFG for it, either, yet
20:43 brrt but that is in the plans
20:45 brrt the relation between the tree and the 'tile' is pretty weak at that point
20:46 brrt (lol @ indian restaurant)
20:48 FROGGS tsss, brits :P
20:52 jnthn :P
20:52 jnthn Worth it...it tasted good :)
20:54 timotimo i think i'm maxing out hack with my niced processes ...
21:02 timotimo the first crash thingie is already taking ages to minimize ...
21:05 timotimo the first stage is removing blocks, which is difficult in .moarvm files i think
21:06 timotimo [+] Block removal complete, 18 bytes deleted.
21:07 FROGGS wow, 18 bytes :D
21:07 timotimo out of half a megabyte
21:07 FROGGS that's like... more than I have fingers *g*
21:09 FROGGS gnight
21:10 timotimo it's super annoying that everything internet-related, but especially ssh sessions, is suuuuper laggy right now :(
21:11 * brrt done for tonight
21:30 timotimo now it's doing a second pass ... that'll take a long time again :\
21:30 timotimo somehow it's removing more blocks now, though
22:00 timotimo cool, xz'd it's now only 4.3K instead of the uncompressed 452K
