Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2014-11-09


All times shown according to UTC.

Time Nick Message
00:03 danaj joined #moarvm
01:54 timotimo a curcode should be speshable immediately into a static pointer set, right?
01:54 timotimo and getcodeobj should also then be constant-foldable?
01:55 timotimo curcode and getcodeobj appear ~820 times in the setting
01:55 timotimo always in code that looks very, very similar
01:55 timotimo not 100% sure what that is, exactly
01:57 timotimo hmm. if i turn curcode and getcodeobj into a constant pointer, that removes a :noinline instruction from that frame and maybe it'd become inlinable?
02:01 timotimo the fact that the code_ref is inside the Frame but not in the StaticFrame makes me think that wouldn't work ...
02:04 timotimo it seems like a proto's dispatcher generates two curcode + getcodeobj calls
02:41 JimmyZ joined #moarvm
02:51 JimmyZ Is there a reason that it can't move code_ref to StaticFrame? :P
02:54 timotimo i don't know much about this part of moarvm
02:54 timotimo if the proto actually does have to call curcode twice and can get different results each time ... that would make that code legit
02:54 timotimo otherwise, it should at least be possible to unify the values and only curcode + getcodeobj once
02:55 JimmyZ looks like it's for state variables
03:00 timotimo oh
03:01 FROGGS[mobile] joined #moarvm
09:29 lizmat joined #moarvm
09:31 woolfy joined #moarvm
09:40 camelia joined #moarvm
09:48 woolfy left #moarvm
11:04 ggoebel111111114 joined #moarvm
11:57 zakharyas joined #moarvm
13:33 timotimo jnthn: this one shouldn't require too much wakeness: can curcode + getcodeobj be spesh'd into something static? i.e. constant folded? does having two pairs of these close by each other make sense? i.e. will the value that you get from them change through some kind of event?
13:36 timotimo because every single one of our protos seems to be doing curcode + getcodeobj twice in relatively close succession
13:36 JimmyZ oh, good morning, timotimo!
13:36 jnthn No, because curcode = current closure
13:37 timotimo OK
13:37 timotimo good morning JimmyZ :)
13:37 jnthn And since generated proto methods all come from a single static thing...
13:37 jnthn ...it'd surely screw up to constant fold it to...something.
13:37 timotimo OK :)
13:38 timotimo since you've been gone, i've been grasping for straws regarding performance improvements m)
13:38 jnthn For non-generated we might do better but...hmm.
13:38 jnthn It's not immediately clear how
13:38 jnthn If we can do better we'd want to just get code-gen to spit out a wval when we know it's OK...
13:38 timotimo OK, that'd be good on all back-ends
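
[Editor's sketch] The exchange above turns on the point that one static body can back many closures, so "the current code object" is not a constant of the static frame. The Python snippet below is only an analogy, not MoarVM code: a function's `__code__` stands in for the static frame, and each function object stands in for the current closure. The name `make_counter` is hypothetical.

```python
def make_counter():
    """Each call returns a new closure with its own captured `count`."""
    count = 0

    def counter():
        nonlocal count
        count += 1
        return count

    return counter


a = make_counter()
b = make_counter()

# Both closures share one static code body...
shared_body = a.__code__ is b.__code__
# ...but they are distinct objects with separate captured state.
distinct_closures = a is not b

a()
a()
b_first = b()  # b's state is unaffected by the two calls to a
```

If an optimizer replaced "current closure" with whichever closure it saw first, `b` would end up reading `a`'s state. That is roughly the failure jnthn describes for generated protos that all come from a single static thing.
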
13:39 timotimo i found something else that stumped me. in control.pm, we have a "my sub THROW" and we use that in take and take-rw
13:39 timotimo now the sub THROW ends in "0" explicitly so that there's no return-value overhead
13:39 timotimo but the users of THROW all inspect the return value for sinkability ... and they also look up "&THROW" indirectly
13:39 timotimo i don't understand why :\
13:42 timotimo is that just something the optimizer should have picked up? or should code-gen have done better?
13:42 timotimo maybe i should inspect what spesh does with that before going on
13:46 jnthn Not sure without looking
13:46 jnthn The indirect lookup sounds like a code-gen thing though
13:46 timotimo i'll follow that hint
13:47 timotimo even though take and take-rw are likely to see a big change for glr, right?
13:50 jnthn Well, if those two are the ONLY things using THROW, we may just consider inlining it by hand into both
13:51 timotimo mhm
13:51 jnthn And get rid of THROW
13:51 timotimo i think 4 things actually use it
13:51 timotimo or something
13:51 jnthn Ah
13:51 timotimo i'll look
13:51 jnthn That's...a little worse :)
13:52 timotimo last, next, redo, succeed, proceed all use it
13:52 timotimo with a "my sub", is it guaranteed to not change? i would assume so
13:53 timotimo surely i'll find more meaningful stuff to change for performance improvements, though.
13:53 timotimo and the code-gen for sink is going to change soon anyway; iirc sinking will become part of p6decontrv or something like that
13:54 timotimo one other thing i saw: every nativecallinvoke is preceded by a call to a generic "decontall" sub
13:54 timotimo i was thinking we should be able to do better than that, especially since we know the number of arguments to the native call when we spesh
13:55 timotimo but i'm bothering you too much now :)
13:55 jnthn "my" just means "lexical"
13:55 jnthn uh
13:55 jnthn And it's the default
13:55 jnthn So the "my" is kinda meaningless here
13:55 jnthn Or at least, surplus
13:57 timotimo oh
13:57 timotimo it doesn't have "my" there (any more?)
13:57 timotimo i didn't see the jvm ifdefs, it seems like
14:01 timotimo jnthn: http://t.h8.lv/add_core_op_extops_and_dataflow.svg ← you might like this :3
14:10 jnthn So diagram!
14:10 jnthn Wow :)
15:30 colomon_ joined #moarvm
15:38 colomon_ joined #moarvm
16:20 zakharyas joined #moarvm
17:28 zakharyas1 joined #moarvm
17:50 FROGGS_ joined #moarvm
18:59 timotimo jnthn: we have sp_findmeth, maybe sp_can_s would be interesting to have, too?
18:59 timotimo i've seen multiple cases where we first can_s, then sp_findmeth
19:01 jnthn We'd really like to find a way to turn those lookups into a single hash lookup...
19:15 timotimo that's a change that probably goes much deeper than what i feel comfortable trying to do myself :S
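
[Editor's sketch] The "can, then find" pattern mentioned above does two lookups where one would do. The Python snippet below illustrates only that general idea. It does not model can_s or sp_findmeth themselves, and `LookupCounter`, `can_then_find` and `find_once` are hypothetical names.

```python
class LookupCounter(dict):
    """A method table that counts the hash lookups made on it."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __contains__(self, key):
        self.lookups += 1
        return super().__contains__(key)

    def get(self, key, default=None):
        self.lookups += 1
        return super().get(key, default)


def can_then_find(table, name):
    """Two lookups: one to test for the method, one to fetch it."""
    if name in table:
        return table.get(name)
    return None


def find_once(table, name):
    """One lookup: a missing method simply comes back as None."""
    return table.get(name)


table = LookupCounter(greet=lambda: "hi")

can_then_find(table, "greet")
two_step_cost = table.lookups

table.lookups = 0
find_once(table, "greet")
one_step_cost = table.lookups
```
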
19:15 timotimo i see no elegant way to improve the decont_all situation, as we don't know the number of arguments coming in at code-gen time :(
19:16 timotimo and the lack of working call graph pruning kinda drives me up the wall, but i don't think i can figure out what's wrong with the code i already pushed :\
19:32 timotimo interesting
19:32 timotimo for postinc native spends 15% of its time garbage collecting, but the allocations tab only shows 2 allocations all in all
19:33 timotimo 782 gc runs for 40964096 runs, each taking about 7ms
19:36 jnthn Then I'm guessing the allocations tab is missing something...
19:37 timotimo after a whole bunch of GCs where we promote nothing at all, would it be sensible to do a gen2 collection to make future collections cheaper?
19:38 jnthn Make them cheaper how?
19:38 timotimo maybe having fewer gen2-to-nursery links?
19:38 jnthn We prune those each nursery collect, no?
19:39 jnthn Unless you mean, the gen2 objects themselves may go away
19:39 jnthn In which case, yes, it may be worth having an absolute threshold as well as the promotion based one, but we'd want to set it fairly high.
19:40 jnthn And I'm still not sure it's worth it
19:44 timotimo i'll try setting it to 200 and see what happens to the gc times after vs before
19:56 timotimo no visible change at all
19:57 timotimo except it takes 10x as long for a single gc run, which is the full collection
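
[Editor's sketch] The idea discussed above, an absolute threshold alongside the promotion-based one, could look roughly like the policy below. This is a minimal Python sketch under stated assumptions, not MoarVM's actual logic. The log does not say what unit the value 200 was in, so counting nursery runs is a guess, and `GcPolicy` with all of its field names is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GcPolicy:
    promoted_limit: int        # promotion-based trigger, in bytes (assumed unit)
    max_nursery_runs: int      # absolute cap on nursery-only runs (assumed unit)
    promoted_since_full: int = 0
    runs_since_full: int = 0

    def after_nursery_run(self, promoted_bytes: int) -> bool:
        """Record one nursery run; return True if the next run should be full."""
        self.promoted_since_full += promoted_bytes
        self.runs_since_full += 1
        return (self.promoted_since_full >= self.promoted_limit
                or self.runs_since_full >= self.max_nursery_runs)

    def after_full_run(self) -> None:
        """Reset both counters once a full collection has happened."""
        self.promoted_since_full = 0
        self.runs_since_full = 0


# A workload that promotes nothing never hits the promotion trigger,
# so only the absolute cap can force a full collection.
policy = GcPolicy(promoted_limit=1_000_000, max_nursery_runs=200)
decisions = [policy.after_nursery_run(0) for _ in range(200)]
```

With these numbers only the 200th nursery run asks for a full collection. As the messages above note, that full run costs far more than a nursery run, which is why the cap would need to be set fairly high.
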
21:12 timotimo i'm now analyzing the nursery's contents
21:22 timotimo jnthn: https://gist.github.com/timo/1eef31e3be2e7e816bef - this is what the nursery looks like in that benchmark
21:23 timotimo we shouldn't be putting that many MVMStaticFrame instances into the nursery, aye?
21:23 timotimo shouldn't we have only a single one per existing frame
21:25 timotimo i don't know very much about moarvm, but i do know we shouldn't have 11k static frames being created for each of the gc runs of which this benchmark has 782
22:21 lizmat joined #moarvm
22:45 woolfy joined #moarvm
23:10 timotimo so i've breakpointed the gc_allocate for MVMStaticFrame and that didn't get triggered after the first gc run was over
23:10 timotimo so we're just cloning them memcpy-style?
23:10 timotimo all over the place?
23:13 timotimo freshcoderef creates static frames by cloning
