
IRC log for #moarvm, 2017-08-05


All times shown according to UTC.

Time Nick Message
01:27 MasterDuke timotimo++ the heaptrack graphs after your rand_I commit look much nicer (for `perl6 -e 'srand(1); say [+] ^1000 .roll(1_000_000)'`)
01:31 timotimo thought so. screenshots, though? :D
01:50 MasterDuke before: http://i.imgur.com/NzSzHUW.png after: http://i.imgur.com/TTqz45D.png
01:50 MasterDuke most of the graphs look pretty similar. steady growth before, plateauing at a much smaller value after (or still steady growth, but with a much smaller max)
01:53 ilbot3 joined #moarvm
01:53 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
02:11 bisectable6 joined #moarvm
02:11 quotable6 joined #moarvm
02:11 unicodable6 joined #moarvm
02:11 bloatable6 joined #moarvm
02:11 committable6 joined #moarvm
02:11 coverable6 joined #moarvm
02:11 evalable6 joined #moarvm
02:11 greppable6 joined #moarvm
02:11 benchable6 joined #moarvm
02:11 statisfiable6 joined #moarvm
02:27 lizmat joined #moarvm
02:34 timotimo holy wow the time difference is amazing
02:35 timotimo jesus christ, the difference of max is what 100x?
02:35 timotimo it's allocations in total, though, so parts of that get freed again of course
08:21 pharv joined #moarvm
09:04 edehont joined #moarvm
09:28 nwc10 timotimo: I configured MoarVM with `Configure.pl --ubsan --asan ...` so I assume it's more that I didn't turn *on* the leak checker, instead of "I turned it off".
09:28 nwc10 but yes, I think that that means that there is no leak checker, so no excitement generated even if stuff leaks
10:19 colomon joined #moarvm
10:34 colomon joined #moarvm
11:39 MasterDuke timotimo: the op name for coverage options: controlcoverage, coveragecontrol, control_coverage, coverage_control? or do you have any other name suggestions?
11:40 dogbert2 joined #moarvm
11:46 jnthn MasterDuke: If it's user-facing, we don't put _ in op names except for type suffixes at the end
11:47 timotimo i'd like coveragecontrol
11:47 jnthn We also have prefixes like sp_ for non-public ops used for the VM's internal purposes
11:47 timotimo sp for spesh
11:48 MasterDuke it's user facing, so coveragecontrol it is
11:51 nwc10 lunch&
12:11 robertle joined #moarvm
13:16 nwc10 joined #moarvm
14:22 AlexDaniel joined #moarvm
15:16 nine I'm writing a realloc helper that initializes the new memory with 0 (could call it recalloc) for fixing serialization reading uninitialized memory. Should I put it alongside MVM_realloc as MVM_recalloc?
15:22 timotimo yeah why not
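(A minimal sketch of the zero-filling realloc helper being discussed, assuming the caller supplies the old allocation size, since realloc itself does not track it; the name and signature are just the ones floated above, not a final API.)

    #include <string.h>

    /* Grow (or shrink) an allocation and zero any newly added tail.
     * old_size must be passed in by the caller. */
    void * MVM_recalloc(void *p, size_t old_size, size_t new_size) {
        p = MVM_realloc(p, new_size);
        if (new_size > old_size)
            memset((char *)p + old_size, 0, new_size - old_size);
        return p;
    }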
15:26 Geth ¦ MoarVM: MasterDuke17++ created pull request #626: Add nqp::coveragecontrol op
15:26 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/pull/626
15:27 MasterDuke timotimo: how's ^^^ look?
15:29 timotimo huh
15:29 timotimo explain the implementation in interp.c to me?
15:29 timotimo oh
15:29 timotimo of course, it shall only work if the env var is set
15:30 MasterDuke yeah
15:30 timotimo the first argument to coveragecontrol should be r instead of w
15:31 timotimo i thought we'd have a separate value for the env var for "no deduplication of log lines", or is there no use for the thing "in the middle"?
15:32 MasterDuke updated to r
15:32 timotimo better
15:32 timotimo (rather than bettew)
15:33 MasterDuke "mawwiage is what bwings us hewe today"
15:33 MasterDuke timotimo: wasn't sure about the de-dup thing
15:34 MasterDuke and the various use cases
15:34 timotimo i think it's a valid use case to not want dedup
15:34 MasterDuke have to not de-dup when using coveragecontrol() of course
15:36 timotimo i mean, you can subtract startup if you don't dedup, but since we also call coveragecontrol after startup you can just use that, too
15:36 timotimo but maybe you want to see how many times individual parts are run
15:36 MasterDuke could add an option to coveragecontrol just to turn off de-dup
15:36 timotimo aye
15:51 MasterDuke has anyone done a coverity scan recently? might be a good idea after all the recent big changes
15:52 timotimo oh, i haven't thought of that in a long time
16:00 Geth ¦ MoarVM: aa740f6329 | (Stefan Seifert)++ | src/6model/serialization.c
16:00 Geth ¦ MoarVM: Ensure that serialized padding bytes are 0
16:00 Geth ¦ MoarVM:
16:00 Geth ¦ MoarVM: Helps creating reproducible build of precompiled Perl 6 modules
16:00 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/aa740f6329
16:00 Geth ¦ MoarVM: 0304b316dd | (Stefan Seifert)++ | src/6model/serialization.c
16:00 Geth ¦ MoarVM: Avoid uninitialized bytes in default tables
16:00 Geth ¦ MoarVM:
16:01 Geth ¦ MoarVM: These tables are all created with reasonable default sizes but it's quite
16:01 Geth ¦ MoarVM: possible that they are not fully utilized in the end. So we need to initialize
16:01 Geth ¦ MoarVM: them with an equally reasonable default value.
16:01 Geth ¦ MoarVM: Helps creating reproducible build of precompiled Perl 6 modules.
16:01 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/0304b316dd
16:01 Geth ¦ MoarVM: 4ce573d9c5 | (Stefan Seifert)++ | 2 files
16:01 Geth ¦ MoarVM: Avoid uninitialized bytes at the end of serialized tables
16:01 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/4ce573d9c5
16:14 geekosaur joined #moarvm
16:15 MasterDuke timotimo: do you think it ever makes sense to turn off de-duping in the middle of a script/program? or should it be an all-or-nothing option controlled by the env variable?
16:15 timotimo i imagined one-or-nothing
16:18 zakharyas joined #moarvm
16:19 MasterDuke something like MVM_COVERAGE_CONTROL=2 means don't de-dup, but don't wait for an nqp::coveragecontrol(1) to start logging?
16:19 MasterDuke and MVM_COVERAGE_CONTROL=1 means don't de-dup, but *do* wait for an nqp::coveragecontrol(1) to start logging?
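(A sketch of the three modes being proposed; the helper and its locals are hypothetical, since MoarVM would keep this state on the instance rather than in a standalone function.)

    #include <stdlib.h>

    /* MVM_COVERAGE_CONTROL unset/0: de-duplicated log, written from startup.
     * MVM_COVERAGE_CONTROL=1: no de-dup, wait for nqp::coveragecontrol(1).
     * MVM_COVERAGE_CONTROL=2: no de-dup, log from startup. */
    static void coverage_mode(int *deduplicate, int *wait_for_op) {
        const char *env = getenv("MVM_COVERAGE_CONTROL");
        int control     = env ? atoi(env) : 0;
        *deduplicate = (control == 0);  /* both 1 and 2 turn de-duplication off */
        *wait_for_op = (control == 1);  /* only 1 gates logging on the op */
    }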
16:20 travis-ci joined #moarvm
16:20 travis-ci MoarVM build failed. Stefan Seifert 'Avoid uninitialized bytes at the end of serialized tables'
16:20 travis-ci https://travis-ci.org/MoarVM/MoarVM/builds/261335193 https://github.com/MoarVM/MoarVM/compare/600c2e9cf20b...4ce573d9c568
16:20 travis-ci left #moarvm
16:23 MasterDuke t/hll/06-sprintf.t                   (Wstat: 0 Tests: 286 Failed: 2)    Failed tests:  285-286    Parse errors: Bad plan.  You planned 284 tests but ran 286.
16:28 http_GK1wmSU joined #moarvm
16:29 MasterDuke nine: ^^^
16:31 nine MasterDuke: ah, I can reproduce it locally. I should really adopt the habit of running nqp's tests, too
16:31 http_GK1wmSU left #moarvm
16:32 nine But I somehow doubt that the failure is due to my commits: Parse errors: Bad plan.  You planned 284 tests but ran 286.
16:34 nine Indeed, reverting nqp commit 980a07ee96216a43958d54878e187a7b0abd8cf1 fixes the test
16:36 nine Pushed a fix for the test plan. It was really just 2 new tests.
16:42 MasterDuke nine++
16:54 Ven joined #moarvm
17:07 pharv joined #moarvm
17:11 benchable6 joined #moarvm
17:11 committable6 joined #moarvm
17:11 statisfiable6 joined #moarvm
18:36 Geth ¦ MoarVM: ff571d17fd | (Stefan Seifert)++ | src/core/alloc.h
18:36 Geth ¦ MoarVM: Fix "error C2036: 'void *' : unknown size"
18:36 Geth ¦ MoarVM:
18:36 Geth ¦ MoarVM: Thanks to ugexe++ for reporting
18:36 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/ff571d17fd
18:40 MasterDuke there seems to be some memory leak/growth after 2017.07
18:41 MasterDuke 2017.07: http://i.imgur.com/nx2EGjw.png HEAD: http://i.imgur.com/8USB7cx.png
18:58 travis-ci joined #moarvm
18:58 travis-ci MoarVM build passed. Stefan Seifert 'Fix "error C2036: 'void *' : unknown size"
18:58 travis-ci https://travis-ci.org/MoarVM/MoarVM/builds/261366040 https://github.com/MoarVM/MoarVM/compare/4ce573d9c568...ff571d17fd17
18:58 travis-ci left #moarvm
18:59 jnthn MasterDuke: Is that really what the data is saying?
19:01 jnthn The line it's highlighting is:
19:01 jnthn void *tospace = MVM_calloc(1, MVM_NU
19:01 jnthn u
19:01 jnthn void *tospace = MVM_calloc(1, MVM_NURSERY_SIZE);
19:02 ilmari nine: -Werror=pointer-arith would have caught that
19:02 jnthn The reason we allocate a lot more there now is because instead of memsetting a ~4MB region, we just free it and calloc a new one
19:02 jnthn Because measurements showed that to be cheaper than the memset
19:03 jnthn callgrind thought a vast amount cheaper; actual time measurements showed rather less
19:03 jnthn But yes, this will mean that our "churn" in terms of allocation calls will be a lot higher
19:03 MasterDuke hm, makes sense
19:04 Ven joined #moarvm
19:04 jnthn Given that the memory in question is fromspace that will be cache cold by then, though, I don't think there's much chance that calloc will do a slower job than we can with memset
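(Roughly the trade-off being described, using plain free/calloc in place of MoarVM's MVM_* wrappers; NURSERY_SIZE stands in for the ~4MB region mentioned above.)

    #include <stdlib.h>
    #include <string.h>

    #define NURSERY_SIZE (4 * 1024 * 1024)

    /* Old approach: reuse the region, paying to write ~4MB of cache-cold memory. */
    void reset_by_memset(void **tospace) {
        memset(*tospace, 0, NURSERY_SIZE);
    }

    /* New approach: release it and let calloc hand back pre-zeroed pages, which
     * for large blocks often costs little up front. This is what shows up as
     * extra allocation churn in heaptrack. */
    void reset_by_calloc(void **tospace) {
        free(*tospace);
        *tospace = calloc(1, NURSERY_SIZE);
    }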
19:05 MasterDuke AlexDaniel thinks the whateverables are slower and leakier than they were recently
19:05 jnthn Yeah
19:05 jnthn I think the top change it points to isn't the one we're after, alas
19:06 jnthn What code is that btw?
19:06 MasterDuke k, any ideas where to look?
19:06 jnthn And why on earth is it decoding a million ASCII strings
19:06 MasterDuke that i heaptracked?
19:06 jnthn Yeah
19:06 jnthn A million allocations in ascii decode must mean a million strings decoded
19:07 jnthn Which is, uh...possible curious
19:07 MasterDuke `perl6 -I Text-Diff-Sift4/lib/ -e 'use Text::Diff::Sift4; my $d; for ^1000000 { $d = sift4(~$_, "10") }; say $d'`
19:07 jnthn *but curious
19:07 jnthn Hm, is it calling .decode somewhere in there? :)
19:07 MasterDuke nope
19:08 jnthn Oh, it's doing a million iterations
19:08 MasterDuke https://github.com/MasterDuke17/Text-Diff-Sift4/blob/master/lib/Text/Diff/Sift4.pm6
19:08 jnthn So it's only doing it once I guess
19:09 jnthn such, I don't see what in there would be doing it either
19:10 jnthn Oh
19:10 jnthn ~$_
19:10 jnthn It'll be that :)
19:10 jnthn Turning a number into an MVMString
19:10 jnthn OK, mystery solved :)
19:10 jnthn Does this code leak btw?
19:10 jnthn Or just show more churn?
19:12 MasterDuke well, /usr/bin/time for 1000 iterations shows 82264maxresident)k, 1000000 iterations shows 95516maxresident)k
19:13 jnthn That could just be the GC finding its natural overhead and/or spesh logs/stats/optimized code using a bit of space
19:13 MasterDuke yeah, 10m iterations is 95704maxresident)k
19:13 jnthn Yeah
19:14 jnthn walk; bbiab
19:16 Geth ¦ MoarVM: ilmari++ created pull request #627: Add -Werror=pointer-arith for gcc
19:16 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/pull/627
19:40 MasterDuke timotimo: any thoughts about my last MVM_COVERAGE_CONTROL=* comments?
19:41 timotimo ah yes
19:41 timotimo that's how i imagined it
19:42 MasterDuke cool, i'll see about adding that to the PR
19:42 timotimo great!
19:42 timotimo thanks a lot
19:43 MasterDuke heh, likewise, you did the hard part of actually implementing line coverage
19:44 timotimo you mean "the easy part", right? :)
19:45 MasterDuke right, bet it was a breeze
19:47 MasterDuke http://i.imgur.com/7DOfHQt.png
19:48 MasterDuke re memory stuff, does ^^^ look more interesting
19:50 MasterDuke it's from running this https://gist.github.com/MasterDuke17/685b627a6a2749483dc5ec09c6a777a4
19:52 MasterDuke /usr/bin/time for 10 iterations shows 1773636maxresident)k, 20 iterations shows 2635308maxresident)k
19:55 MasterDuke https://github.com/MoarVM/MoarVM/blob/master/src/io/procops.c#L509 is what heaptrack says is the biggest allocator (after the calloc in gc we talked about earlier)
19:55 timotimo i wonder what the suggested size would turn out to be?
19:56 timotimo is it perhaps wildly overestimated?
20:00 MasterDuke it's always 65536
20:08 AlexDaniel MasterDuke: so Sift4 by itself does not leak?
20:09 MasterDuke doesn't seem like it
20:10 AlexDaniel uhhhhh
20:11 AlexDaniel then it has to be something here https://github.com/perl6/whateverable/blob/master/Whateverable.pm6#L205-L231
20:11 timotimo MasterDuke: could you check what buf->len and nread are in async_read when it repr_push_o the newly created buffer object into "arr"?
20:11 timotimo that way we could figure out if we keep much-too-big buffers around instead of maybe reallocing them to shrink 'em
20:12 AlexDaniel it leaks tens of megabytes per one execution of get-similar, so the leak should be associated with the data
20:12 MasterDuke line 531?
20:12 Geth ¦ MoarVM: ece9788ee0 | (Dagfinn Ilmari Mannsåker)++ | build/setup.pm
20:12 Geth ¦ MoarVM: Add -Werror=pointer-arith for gcc
20:12 Geth ¦ MoarVM:
20:12 Geth ¦ MoarVM: This disables the GNU extension that allows arithmetic on pointers to
20:12 Geth ¦ MoarVM: void and functions, which standard C (and MSVC) don't allow.
20:12 Geth ¦ MoarVM:
20:12 Geth ¦ MoarVM: Also remove redundant -Wdeclaration-after-statement: -Werror=foo implies
20:12 Geth ¦ MoarVM: -Wfoo.
20:12 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/ece9788ee0
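(The kind of code the flag rejects: gcc's extension treats sizeof(void) as 1, but standard C and MSVC refuse arithmetic on void pointers, so the portable form casts to char * first.)

    #include <stddef.h>

    void *advance_bad(void *p, size_t n) {
        return p + n;            /* GNU extension; -Werror=pointer-arith makes this an error */
    }

    void *advance_ok(void *p, size_t n) {
        return (char *)p + n;    /* portable: arithmetic on a char pointer */
    }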
20:13 timotimo around there, yeah
20:14 MasterDuke buf->len is always 65536, nread ranges from 41 to 11767
20:15 timotimo okay, if we don't shrink them at all we'll be leaving a lot of memory on the table
20:15 timotimo however
20:15 timotimo we're likely to just immediately decode the buffers, i believe
20:15 timotimo so they should actually be thrown away quickly
20:16 AlexDaniel ok I can reproduce it
20:17 AlexDaniel https://gist.github.com/AlexDaniel/5763d0bcb261f11733bd15893b7e2a85
20:19 AlexDaniel if my math is correct, it's about 128 MB per call
20:19 AlexDaniel unfortunately I have to go now…
20:20 timotimo thanks for the effort
20:20 AlexDaniel well, I'll get back to it soon
20:20 AlexDaniel get-similar only needs get-output, which is essentially just “run”
20:21 AlexDaniel so this looks golfable
20:21 AlexDaniel anyway, cya
20:22 timotimo MasterDuke: you think it'd be a good idea to realloc these bufs to a number closer to the number of bytes read? do these buffers stick around?
20:25 MasterDuke no idea
20:27 MasterDuke i think this is the first time i've looked at procops.c
20:28 Ven joined #moarvm
20:28 MasterDuke could try and see what happens. where should they be realloced?
20:30 timotimo just before they get pushed there
20:31 timotimo same location i pointed at before
20:35 dogbert11 joined #moarvm
20:38 MasterDuke the buffer is created via MVM_repr_alloc_init?
20:39 MasterDuke `res_buf->body.ssize    = buf->len;` should be `res_buf->body.ssize    = nread;` instead?
20:44 jnthn No, because otherwise the heap profiler can't account for it properly
20:44 jnthn ssize is meant to be the allocated amount, which is what buf->len is
20:44 jnthn elems is the amount actually in use
20:45 jnthn libuv, at least with sockets, seems to always tell us quite a big amount even if in reality it only puts some hundreds of bytes in there, though. There's quite possibly a good reason for that.
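(What timotimo's shrink-before-push suggestion might look like; the field accesses are illustrative only, not the actual procops.c code, but they keep ssize equal to the allocated size as jnthn describes.)

    /* nread is the byte count libuv reported; buf->len is the 65536 it allocated. */
    if (nread > 0 && (MVMuint64)nread < buf->len) {
        res_buf->body.slots.i8 = MVM_realloc(buf->base, nread);
        res_buf->body.ssize    = nread;   /* allocated size after the shrink */
        res_buf->body.elems    = nread;   /* bytes actually in use */
    }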
20:46 MasterDuke is there something wrong in the perl6 code we're running?
20:47 MasterDuke which is essentially this: https://gist.github.com/MasterDuke17/685b627a6a2749483dc5ec09c6a777a4
20:48 jnthn Not so far as I can see.
20:48 jnthn Huge leak though o.O
20:48 jnthn I wonder if it's related to the one I've noticed with an async socket server in a react block
20:49 jnthn Planning to investigate that one some more next week
20:49 jnthn No particular idea what is going on
20:49 MasterDuke sounds similar
20:51 colomon joined #moarvm
20:56 MasterDuke any pointers where to look?
21:00 colomon joined #moarvm
21:07 Geth ¦ MoarVM: dogbert17++ created pull request #628: Remove unused vars
21:07 Geth ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/pull/628
21:07 channing1 joined #moarvm
21:37 dogbert11 MasterDuke: tried your gist, although I ran fewer iterations. Get the impression that it leaks ~10k per iteration
21:39 dogbert11 that leak seems to involve newly added functionality
21:39 MasterDuke huh, pretty sure i'm seeing more than that. wonder if it's the 32vs64 bit?
21:40 dogbert11 could be, and perhaps the fact that the jit doesn't work in 32 bit might have something to do with it
21:40 AlexDaniel so does it leak or not? :)
21:41 AlexDaniel ah, I see
21:41 AlexDaniel ok
21:41 AlexDaniel MasterDuke++
21:42 AlexDaniel would be nice to have a ticket with this
21:42 AlexDaniel as far as I know, this is a regression
21:43 AlexDaniel at least, I'm 90% sure that it was working OK before the most recent changes
21:45 MasterDuke AlexDaniel: do you see any difference if you disable spesh or jit?
21:45 AlexDaniel I see no difference
21:46 AlexDaniel anyway, later :)
21:49 dogbert11 MasterDuke: your gist contains this line which is commented out, # flat(@options, @tags, @commits).min: { sift4($_, $tag-or-hash, 5, 8) }
21:50 MasterDuke yeah, it's adapted from whateverable code
21:50 dogbert11 and you get a big leak without it?
21:51 MasterDuke yeah, that's commented out in the "real" code also
21:52 dogbert11 valgrind doesn't find any big leaks but using /usr/bin/time I see that maxresidentk gets higher the more iterations I run
21:53 dogbert11 it's around 0.5gig after 20 iterations
21:56 dogbert11 almost as if the GC isn't doing what one might expect, odd
21:58 timotimo you can ask the telemetry log how often gc runs if you're worried it doesn't do enough
21:59 MasterDuke i updated my gist with the output of running for 5 iterations with valgrind --leak-check=full
22:01 MasterDuke big jumps in amount lost here: https://gist.github.com/MasterDuke17/685b627a6a2749483dc5ec09c6a777a4#file-valgrind-log-L95-L174
22:01 dogbert11 yeah, 66k definitely lost
22:03 dogbert11 most of what is 'new' comes from spesh stats/plan related stuff. The unicode thing has been there for ages :)
22:04 MasterDuke valgrind output looks the same even with an nqp::force_gc() in the loop
22:06 dogbert11 would it be possible for the amount of memory used by spesh to run amok or is there some kind of limit in place?
22:19 jnthn If MVM_SPESH_DISABLE=1 then it never allocates a log buffer to write into, so the stats stuff never gets data...
22:19 jnthn So if it leaks with MVM_SPESH_DISABLE=1 then it rules out quite a bit.
22:21 MasterDuke the numbers are a little smaller, but still regularly increasing with MVM_SPESH_DISABLE=1
22:21 jnthn Yeah, then it's something not spesh, almost certainly
22:21 jnthn There have been GC changes but...
22:22 jnthn Hard to imagine them introducing leaks :S
22:22 MasterDuke dinner &
22:22 dogbert11 the amount of memory used is quite high though, this is from a 10 iteration run:
22:22 dogbert11 ==20843==     in use at exit: 138,096,415 bytes in 437,172 blocks
22:23 dogbert11 ==20843==   total heap usage: 1,566,554 allocs, 1,129,382 frees, 4,591,697,481 bytes allocated
22:23 dogbert11 definitely lost: 64,928 bytes in 3,183 blocks
22:24 d33pb00k-GK1wmSU joined #moarvm
22:24 d33pb00k-GK1wmSU left #moarvm
22:36 dogbert11 doing the ten iterations with MVM_SPESH_DISABLE=1 leaks 'definitely lost: 1,256 bytes in 29 blocks'
22:37 dogbert11 half of that amount comes from MVM_decoder_configure (Decoder.c:131)
22:41 * dogbert11 has problem seeing the problem :)
22:42 timotimo got an idea how often that gcs in total?
22:42 timotimo might be able to get some insight from a heap snapshot
22:43 MasterDuke i get `Command terminated by signal 11` if i add --profile=heap
22:43 MasterDuke timotimo: how do you turn on telemeh?
22:46 MasterDuke 35 'gc finished' in the telemetry log for a 5 iteration run
22:52 timotimo that's not terribly much
23:37 MasterDuke timotimo: PR updated
23:44 timotimo looks good i think!
23:45 MasterDuke cool
23:46 timotimo anything good to find in the output?
23:47 timotimo also, i wonder when we should change the output format to be separated by something easy-to-find
23:47 timotimo like nul bytes
23:50 MasterDuke heh, i had some actual usecase, but now have spent so long messing with getting it working i don't remember what it was
23:51 timotimo we can figure out what the hot spots are in a program, and probably also an exact trace through the program
23:51 timotimo files are gonna be huge, though
23:52 MasterDuke yep and yep
23:54 MasterDuke will add a column to coverable6's output for hit count
23:55 timotimo mhm
23:55 timotimo got an idea what the performance impact is?
23:57 MasterDuke somewhat minimal, but i think the only time i measured it wasn't a very long or cpu intensive program anyway
23:58 timotimo i have a lock there, right? that'd serialize multiple threads?
23:59 MasterDuke there where?
23:59 timotimo i don't seem to, no
