Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2016-09-02


All times shown according to UTC.

Time Nick Message
05:50 dalek Heuristic branch merge: pushed 25 commits to MoarVM/even-moar-jit by bdw
05:50 brrt joined #moarvm
05:55 brrt good * again
05:59 brrt hey #perl6
06:00 brrt eh
06:00 brrt #moarvm
06:00 brrt honest question
06:00 brrt is writing a binary heap obscure?
06:00 brrt (an implementation of)
06:01 konobi brrt: i'd have a look at libconcurrencykit for something like that
06:01 brrt well, i wasn't necessarily asking for a concurrent binary heap
06:02 brrt although that is possible too, and fun
06:02 konobi this is concurrent in NUMA terms
06:02 brrt okay, for the fun of it i'll check out libconcurrencykit, but that was not really my question
06:02 konobi so dealing with the swathe of optimized versions that work across NUMA implementations
06:03 konobi what are you trying to get out as an answer?
06:04 brrt 'yes, i wouldn't write a binary heap unless it was really necessary and i had done proper preparation', 'no, writing a binary heap is child's play', and the latter possibly followed by 'which is why i don't spend my time on childish things' or possibly 'i do it all the time'
06:05 brrt basically, i'm trying to assess if the people who claim that writing a binary heap is obscure - something i see quite a bit - are thinking in the same way about it as I am, or whether my perspective is warped somehow
06:05 brrt i know writing binary heaps is a simple exercise that nearly everybody does, or rather, most people with a CS education do
06:06 brrt it's not just about the binary heaps though :-)
06:07 geekosaur you may be seeing a division between people with formal CS / algo training and those of us without it
06:07 brrt hmm
06:07 brrt okay
06:07 * geekosaur doesn't even remember what a binary heap is ... self taught, algorithms is still a shortcoming
06:08 brrt a binary heap is a tree structure in which every node has at most two children and satisfies the heap property: in a min-heap, every node is smaller than both of its children
06:08 brrt its usual implementation is in a linear array
06:09 brrt hmm okay
06:09 brrt but that is confusing...
06:10 brrt (and its purpose is to get the smallest element in constant time while maintaining the heap through inserts and deletes in log(n) time)
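
A minimal sketch of the structure brrt describes (illustrative C, not MoarVM code): the heap lives in a flat array, node i's children sit at indices 2i+1 and 2i+2, the minimum is always at index 0, and insert and delete-min each walk a single root-to-leaf path.

    /* Illustrative min-heap over ints in a flat array; not MoarVM code.
     * heap->items[0] is always the smallest element (O(1) peek);
     * insert and delete-min sift along one path, O(log n). */
    #include <stddef.h>

    typedef struct { int *items; size_t num; size_t alloc; } Heap;

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    static void heap_insert(Heap *h, int val) {
        /* assume h->alloc is large enough for this sketch */
        size_t i = h->num++;
        h->items[i] = val;
        while (i > 0 && h->items[(i - 1) / 2] > h->items[i]) { /* sift up */
            swap(&h->items[(i - 1) / 2], &h->items[i]);
            i = (i - 1) / 2;
        }
    }

    static int heap_delete_min(Heap *h) {
        int min = h->items[0];
        h->items[0] = h->items[--h->num];      /* move last to root */
        size_t i = 0;
        for (;;) {                             /* sift down */
            size_t l = 2 * i + 1, r = 2 * i + 2, smallest = i;
            if (l < h->num && h->items[l] < h->items[smallest]) smallest = l;
            if (r < h->num && h->items[r] < h->items[smallest]) smallest = r;
            if (smallest == i) break;
            swap(&h->items[i], &h->items[smallest]);
            i = smallest;
        }
        return min;
    }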
06:14 brrt otoh, my perspective probably /is/ warped; i started out with C, from a book, that had me do exercises on merge sort and quick sort, linked lists and trees just to get started
06:14 brrt if i compare that with what I later did making PHP websites, then, yes, binary heaps are comparatively obscure
06:26 brrt decommute, apologies for being weird this morning &
06:40 domidumont joined #moarvm
06:44 domidumont joined #moarvm
06:45 lizmat joined #moarvm
08:10 zakharyas joined #moarvm
10:12 TheLemonMan joined #moarvm
10:13 arnsholt Yeah, I think the division (re: binary heap) is mostly theoretical background vs. non-theoretical background
10:14 arnsholt A binary heap isn't *terribly* hard to implement, but it's a data structure whose utility isn't necessarily immediately obvious
10:31 lizmat joined #moarvm
10:51 moritz the problem with big binary heaps is that they tend not to be very cache friendly
10:58 arnsholt Yeah, there's a very interesting Poul-Henning Kamp article about that
10:58 arnsholt http://queue.acm.org/detail.cfm?id=1814327
11:08 konobi yeah, it's why i had mentioned libck... as they have a variety of heap implementations, each with better relative performance based on workload (and data/graphs to provide evidence for that)
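
A back-of-the-envelope version of moritz's point (assuming 4 KB pages and 8-byte entries, i.e. 512 entries per page): in the flat-array layout, node i's children sit at 2i+1 and 2i+2, so below roughly the first nine levels (the ~511 entries sharing the root's page) every parent-to-child step about doubles the index and lands on a different page. A delete-min on a million-entry heap walks ~20 levels, so potentially ~11 distinct pages; the B-heap layout in the Kamp article exists precisely to keep whole subtrees together on one page.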
11:09 nine joined #moarvm
11:09 mtj_ joined #moarvm
11:09 nwc10 joined #moarvm
11:10 mtj__ joined #moarvm
11:11 mtj_ joined #moarvm
11:22 lizmat joined #moarvm
12:01 lizmat joined #moarvm
12:17 TheLemonMan jnthn, moarvm's #145 can be closed
12:21 TheLemonMan and #175 too, I can't reproduce it anymore
12:35 JimmyZ #175 maybe needs a test
12:38 lizmat joined #moarvm
12:40 jnthn Seems JimmyZ++ beat me to closing it :)
12:41 JimmyZ didn't close #175 yet, I think it needs a test added to roast
13:09 * TimToady fondly remembers his first heap, written as part of a scheduler for a discrete event simulator
13:10 TimToady written in Pascal, but definitely not a toy
13:11 TimToady ran on a very expensive Amdahl, so speed was of the essence
14:23 brrt joined #moarvm
14:23 * brrt thinks heaps are just about one of the most awesome data structures so is probably biased
14:24 timotimo heaps are cool
14:24 timotimo the thing is, that heap that's presented in that article is also a heap
14:24 timotimo its allocation strategy is quite a bit more complicated than a binary heap's, but as soon as just a single page has been swapped out and has to be grabbed from even an SSD, it outperforms the classic binary heap
14:25 brrt yeah, it's a heap, and it's a binary heap in the sense that a heap node has two children
14:25 timotimo on the other hand, you're writing our JIT. you probably instantiate that data structure at the beginning of jit compilation and purge it at the end
14:25 brrt yes
14:25 timotimo that's true
14:25 brrt so, i'm kind of having the nagging feeling PHK overstated his achievements somewhat
14:26 brrt and, yes, i'll kill my data structures as soon as i get the chance :-)
14:26 timotimo data structures must perish, yeee-art!
14:27 brrt but, on the other hand, not too quickly
14:27 brrt i'm wondering if i can figure out the worst-case storage requirements for the definitions-and-uses array
14:28 brrt to put it shortly: i can and do precompute the number of definitions-and-uses which i need
14:28 brrt i need definitions-and-uses to figure out where to insert stores and loads after spilling, and whether to split a live range
14:29 brrt however, if i spill something, then the single live range covering it is split into N live ranges (one for each definition and one for each use)
14:30 brrt <rubberduck>: so suppose i have X live ranges, and all together they have Y definitions and Z uses...
14:30 brrt now, i know that X <= max(Y,Z)
14:30 timotimo you think someone's going to purposefully write code that explodes the size requirements of the jit? :)
14:31 brrt no, i think, how am i going to make space for all the 'tiny live ranges' that have split off from the main live range
14:31 brrt considering that the live ranges themselves are on a dynamic array so i can add them as i like
14:32 brrt but i'd prefer not to have X tiny arrays, one per live range, most of which contain just 1 definition and 1 use
14:32 TimToady is that typical?
14:32 brrt in unoptimized expr jit code, yes
14:32 brrt when i have written the optimizer, the case of single-definition-many-uses will be more common
14:33 * TimToady backs away from the yak
14:33 brrt yak? :-)
14:33 TimToady the one you're about to shave...
14:33 brrt haha
14:34 brrt yaks like these make projects like graal/truffle make sense
14:34 brrt but, anyway
14:35 brrt what i do, is i allocate a single buffer of Y + Z in size, and assign pointers into that to each of the live range objects
14:35 brrt however, i may have to spill some of the live ranges....
14:35 brrt hmmm
14:36 brrt let me think about that for a bit
14:37 brrt suppose we have a live range that is like this: { defs => [ 1, 4, 9], uses => [2,3,5,8,13] }
14:37 brrt spilling this live range means:  - inserting stores after 1,4,9; inserting loads before 2,3,5,8,13
14:37 brrt producing not 1 but 3 (for all defs) + 5 (for all uses) live ranges
14:38 brrt (let's ignore the fact that 1 and 2 and 3 are just after one another)
14:39 brrt so what i get is: [ { defs => [1], uses => [1] }, { defs => [2], uses => [2] }, { defs => [3], uses => [3] }, ... ] -- where for a range derived from an old use, the 'def' is the _load_ i insert before it, and for a range derived from an old def, the 'use' is the _store_ i insert after it
14:41 brrt however, that makes at most Y + Z new live ranges, with Z new definitions and Y new uses
14:41 brrt does that make sense?
14:41 brrt it means i can 'fragment' my live ranges as much as i like and never need more than Y + Z live ranges and (Y + Z) * 2 uses/definitions
14:42 brrt and since memory is cheap, i'll allocate X + Y + Z live ranges, and will never have to reuse an old live range
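
A sketch of that bookkeeping (invented names, not the actual MoarVM JIT code): spilling turns each def into a tiny range closed by the inserted store, and each use into a tiny range opened by the inserted load, so a spilled range with Y defs and Z uses yields exactly Y + Z tiny ranges and 2 * (Y + Z) def/use slots, all of which fit in one preallocated buffer.

    /* Illustrative sketch of spilling one live range; names are invented.
     * Each def gets a store inserted right after it, each use gets a load
     * inserted right before it; every insertion yields a tiny live range
     * with exactly one def and one use. */
    #include <stdint.h>

    typedef struct {
        uint32_t *defs; uint32_t num_defs;  /* points that write the value */
        uint32_t *uses; uint32_t num_uses;  /* points that read the value  */
    } LiveRange;

    /* Append num_defs + num_uses tiny ranges to `out`, pointing their
     * single def/use slots into the preallocated `slots` buffer. */
    static uint32_t spill_live_range(const LiveRange *lr, LiveRange *out,
                                     uint32_t *slots) {
        uint32_t n = 0;
        for (uint32_t i = 0; i < lr->num_defs; i++) {
            slots[2*n]   = lr->defs[i];   /* the original def            */
            slots[2*n+1] = lr->defs[i];   /* store inserted just after it */
            out[n] = (LiveRange){ &slots[2*n], 1, &slots[2*n+1], 1 };
            n++;
        }
        for (uint32_t i = 0; i < lr->num_uses; i++) {
            slots[2*n]   = lr->uses[i];   /* load inserted just before it */
            slots[2*n+1] = lr->uses[i];   /* the original use             */
            out[n] = (LiveRange){ &slots[2*n], 1, &slots[2*n+1], 1 };
            n++;
        }
        return n;  /* == num_defs + num_uses */
    }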
14:43 brrt /splitting/ live ranges complicates that, though
14:44 brrt however, not by all that much, since much the same logic applies....
14:44 brrt also
14:45 brrt when i'm going to split a live range, that's usually because i'm going to spill part of it
14:45 brrt so it is feasible to allocate the 'spilled' part temporarily and then the usual logic applies again
14:49 brrt on the other hand
14:49 brrt splitting is an advanced feature
14:50 brrt doesn't have to be in there immediately
14:53 timotimo doesn't sound like it's optional :)
14:57 brrt spilling isn't optional. but splitting is being clever about spilling
14:57 * [Coke] is sad you cannot shave the yaks in FarCry 4.
14:58 brrt i.e. suppose i need the register used by our fictional live range after sequence point 5; then all of defs [1,4] and uses [2,3,5] can still use the in-register version
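
Concretely, with the fictional range { defs => [1,4,9], uses => [2,3,5,8,13] }: splitting at sequence point 5 keeps { defs => [1,4], uses => [2,3,5] } in the register and only hands the tail { defs => [9], uses => [8,13] } to the spiller, i.e. the store-after-def / load-before-use rewrite from above, which touches far fewer points than spilling the whole range.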
14:58 * timotimo afkbbl
14:58 lizmat joined #moarvm
15:18 domidumont joined #moarvm
15:21 domidumont joined #moarvm
16:24 domidumont joined #moarvm
17:39 timotimo http://jcdav.is/2016/09/01/How-the-JVM-compares-your-strings/  -  i guess we're still very far away from "string comparison is a big part of our performance", but this is still an interesting read
17:53 hoelzro timotimo: that guy's blog is really interesting
17:54 hoelzro I got hooked on it when I saw his post on how the JVM uses SIGSEGV tricks to implement a GC fence
17:54 * hoelzro doesn't know if that's the right term
17:54 hoelzro http://jcdav.is/2015/11/09/More-JVM-Signal-Tricks/
18:03 zakharyas joined #moarvm
18:32 TheLemonMan iirc D uses a similar approach
18:35 hoelzro oh, really? that's cool
18:38 domidumont joined #moarvm
18:40 TheLemonMan hoelzro, https://github.com/dlang/druntime/blob/master/src/core/thread.d#L1910-L1913
18:54 hoelzro hmm, interesting
18:54 hoelzro so the thread that's GC'ing (I presume) fires a SIGUSR1 at each thread to get them to pause?
18:54 hoelzro how do they know that that thread is in any sort of consistent state?
19:01 TheLemonMan if there's one thing modern databases have taught us, it's that consistency is overrated :P
19:03 timotimo "At first glance, this doesn't seem to serve any real purpose right before the return. Rather than doing any useful work, its just a way of trying to read from 0x7fce84071000, the designated polling page. Modifying access to this page lets the JVM stop threads cleanly - in places where the state of the world is well-known."
19:03 timotimo http://jcdav.is/2015/11/09/More-JVM-Signal-Tricks/
19:22 jnthn iiuc, you just try to deref something at each safepoint and if it SIGSEGVs or so then you enter the GC
19:23 jnthn It avoids the branches around various Moar safepoints
19:23 timotimo right
19:23 jnthn At the cost of needing platform-specific trickery :)
19:23 timotimo right
19:23 timotimo not something we need immediately
19:23 timotimo otoh, we can mprotect the page to "no access" to make other threads grind to a halt
19:23 timotimo when we want to dump backtraces of all threads and exit the interpreter immediately
19:24 timotimo probably not easy to make work right
19:24 jnthn Remember that this doesn't free you from solving the blocked-threads problem
19:24 timotimo well, if we were going to crash and burn anyway ... :)
19:25 timotimo you mean threads blocked with something that sets them "blocked"?
19:25 jnthn A thread blocking on I/O, a mutex, etc. is not executing
19:25 jnthn Right
19:25 timotimo the only big problem about that is when it goes on to immediately continue doing stuff
19:25 timotimo er, i suppose that's already obvious to you
19:25 jnthn At the moment the Moar mechanism for that + safepoints is unified into a single flag :)
19:25 jnthn Held per thread for cache friendliness
19:26 timotimo mhm, fair enough
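
For reference, a rough POSIX sketch of the JVM-style polling-page trick being discussed; this is not MoarVM's mechanism (MoarVM checks a per-thread flag, as jnthn says) and all names here are invented. The poll is branch-free: normally a single cheap read; arming the page makes every thread fault into the handler at its next safepoint.

    /* Rough POSIX sketch of safepoint polling via a protected page.
     * Not MoarVM code; error handling omitted. */
    #include <signal.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *polling_page;

    static void on_segv(int sig, siginfo_t *info, void *ctx) {
        (void)sig; (void)ctx;
        if (info->si_addr == polling_page) {
            /* a thread hit a safepoint while the page was armed:
             * park it here until the stop-the-world phase is done;
             * returning re-executes the (now harmless) load */
        }
        /* otherwise: a genuine crash; re-raise or abort */
    }

    void safepoint_init(void) {
        polling_page = mmap(NULL, getpagesize(), PROT_READ,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        struct sigaction sa;
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);
    }

    /* Compiled code emits the equivalent of this load at each safepoint. */
    static inline void safepoint_poll(void) {
        *(volatile char *)polling_page;
    }

    /* Stop all threads: revoke access, so each faults at its next poll;
     * resume them by restoring read access. */
    void safepoint_arm(void)    { mprotect(polling_page, getpagesize(), PROT_NONE); }
    void safepoint_disarm(void) { mprotect(polling_page, getpagesize(), PROT_READ); }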
21:42 lizmat joined #moarvm
22:45 dalek joined #moarvm
