Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2017-01-23

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
02:48 ilbot3 joined #moarvm
02:48 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
06:58 domidumont joined #moarvm
07:35 samcv timotimo, more progress. Running names will print this out https://gist.github.com/13247bbed6c2cca65625365958d906c3
07:36 samcv and we programatically generate the control code's names
07:37 samcv the best part is, UCD-gen.p6 generates the code for which points to generate that are <control-whatever> names
07:37 samcv here is the full source https://gist.github.com/1989fe2c72f7aafa4c4df7b79c968926 (100 codepoints)
07:48 timotimo very cool
07:49 timotimo i expect this mechanism will also be used for variant selectors and for the cjk thingamabobs?
07:49 samcv yeah at least for now. not sure how we want to handle unicode names being indexed
07:49 samcv so this should be pretty adaptable
07:50 samcv and I can re-use the same code for things like ranges of cp's that have the same bitfield index
07:50 timotimo nice
07:50 timotimo but what do you mean "being indexed"?
07:50 timotimo you mean a request "give me the unicode name for codepoint i"?
07:50 samcv being able to go from a cp to a name without searching through all of them
07:50 samcv yep
07:51 timotimo i suggest we build a table that'll give us indices into the uninames array (i.e. the packed 16bit ints)
07:51 samcv every some number of codepoints?
07:51 timotimo for example
07:51 timotimo if we go that route, we'll have to make sure we can address sub-int units, i.e. the base40 codemes
07:51 samcv just give it the index that has the 0 value and so the name begins
07:52 samcv i don't think we need that. and storing pointers to all of them would be a bit wasteful
07:52 samcv we could always skip a codeme though
07:52 timotimo always skip one?
07:52 samcv if we started out seeing the end of the previous string
07:52 samcv err skip. uhm
07:53 samcv each base 40 sig digit place
07:53 samcv was that a codeme, or was the single 40-digit integer a codeme
07:53 timotimo well, i suppose we can guarantee that no string has a name of length two, and the strings that are encoded as just a 0 we can already handle through get_uninames
07:53 timotimo codemes is what a 16bit int has three of
07:53 samcv kk
07:53 samcv i don't see why having a name that is only 2 letters long is an issue
07:54 timotimo oh, only one letter long would be a problem
07:54 samcv 0XX|0 < we have to read two 40-base int's
07:54 timotimo because then you'd have 0, letter, 0 in the int
07:54 samcv why can't we have that?
07:54 timotimo if we index based on 16-bit units, and we say "we give you the 16bit int that has the zero that the previous name ends with", you can't tell if the first zero was meant or the second one
07:54 samcv it reads between the first 0 and the next, and whatever is between those two (codeme wise) is the name
07:55 timotimo ok, so how do you address the next word?
07:55 samcv hm. ok
07:55 samcv i see your point
07:57 timotimo as long as the array of codemes doesn't exceed what we can address with 16bit, it should be very doable to have a fast access table that doesn't waste too much memory
07:58 timotimo and if we reach the point where we no longer can address it with 16bit per entry, we can go from absolute indices to encoding the distance between what we index and turn the binary search into a linear search
07:58 samcv so there's like 0xE01ED names
07:59 samcv m: 0xE01ED.base(2).chars
07:59 camelia rakudo-moar 582260: ( no output )
07:59 samcv m: 0xE01ED.base(2).chars.say
07:59 camelia rakudo-moar 582260: OUTPUT«20␤»
07:59 samcv that doesn't fit into 16-bit
07:59 samcv so I guess we could do it every other
08:00 timotimo you have it the wrong way around i think
08:00 timotimo my idea was this:
08:00 timotimo we have an array that'll tell us: codepoint name for U+100 starts at codeme 45, codepoint name for U+200 starts at codeme 99, codepoint name for U+300 starts at codeme 123
08:01 timotimo so what we have to store is the codeme's address, which is (if we don't have a trick) the length of our 16-bit array times 3
08:01 samcv yeah exactly
08:01 timotimo if we have the "we just point at a 16bit integer and whereever the 0 is that's where the name will start" we can cut that to a third
08:01 samcv so we just store the index of every other unicode name
08:02 timotimo i don't get it
08:02 samcv then we will fit into a 16-bit int
08:02 timotimo we already only store the index of every 100th unicode name
08:02 samcv index 0, cp 0/1, index 3 cp 2/3 etc
08:02 timotimo (that number is up for debate and tuning of course)
08:02 samcv so we know where every odd numbered cp starts
08:02 timotimo oh?
08:02 samcv if it's even it's the second name
08:03 timotimo that sounds smart
08:03 samcv and then it will be fine even if we have 2 character unicode 1 names
08:03 timotimo but if we already use 20 bits and we only address every other, we go to 19 bits, not to 10 bits :(
08:05 samcv true :(
08:05 timotimo we'd have to ... *counts on fingers* ;)
08:05 timotimo every 16th
08:06 samcv yeah
08:10 brrt joined #moarvm
08:10 brrt ohai #moarvm
08:11 brrt perhaps unsurprisingly, timeout(1) in perl is about 10 LoC
08:11 brrt perhaps more surprisingly, our expr JIT failure was really difficult to pin down to a miscompile or a mis-register-allocate
08:12 brrt the broken frame was in fact '!reserve_fates', so that was kind of suspicious in my book
08:12 brrt given that i'd seen a bunch of commits by TimToady that had.. 'something' to do with that
08:13 timotimo huh, interesting
08:13 brrt now i've merged origin/master into even-moar-jit, and my infiinte loop has disappeared
08:13 timotimo i think reserve_fates is absolutely new?
08:13 brrt this is something i know nothing off
08:14 timotimo it has to do with having fate indices globally unique
08:14 brrt i'm perfectly happy to pass the compiled frames for further inspection
08:14 timotimo where globally is, of course, problematic when we precompile stuff that we later load in
08:14 brrt but i can't for the life of me tell what the error is
08:14 brrt hmmm
08:14 Geth MoarVM/even-moar-jit: 23 commits pushed by bdw++
08:14 Geth MoarVM/even-moar-jit: review: https://github.com/MoarVM/MoarVM/compare/349b3605d1...840891ced4
08:15 brrt in fact, the expression JIT does a bunch of interesting thigns, but not that
08:15 timotimo "that"?
08:15 brrt not anything that's obviously broken
08:15 timotimo ah
08:15 brrt heh, i'll make a gist, see what you think
08:17 timotimo so i'll be looking at mvm bytecode on one side and x86_64 assembly on the other?
08:17 brrt https://gist.github.com/bdw/10b46860853822c95935cc7580f002a6
08:17 brrt no, two different compiles of the same frame
08:19 domidumont joined #moarvm
08:19 brrt aha
08:19 zakharyas joined #moarvm
08:19 brrt i think i've spotted it
08:19 timotimo i should be cleaning and tidying :o
08:19 timotimo time's running out
08:20 brrt no matter
08:22 timotimo what's the difference?
08:23 brrt well, i'm adding to r10, while i should be adding to r9
08:23 brrt somehow the register has changed from r9 to r10
08:23 brrt weird!
08:24 domidumont joined #moarvm
08:25 timotimo sounds weird, yeah
08:25 brrt oh, i think i know why it's happened
08:26 brrt the output register is r10, but the first input register is r9
08:26 timotimo ah, and it derped which is out and which is in?
08:26 brrt yeah, this is another variant of three-op/two-op shenanigans
08:26 timotimo yeah, that is difficult
08:28 brrt oh well
08:29 brrt there's probably a generic way to resolve this…. but i need to find it first
08:30 timotimo so is that fixed now or just circumvented?
08:36 timotimo tbqh i'm surprised we could compile such a big frame with the exprjit
08:37 timotimo i thought its selection of tiles was so very limited still
08:37 timotimo there's even some gotos in there
08:41 timotimo wait. that doesn't necessarily mean all of that is exprjit compiled
08:41 timotimo it's quite crazy to me that that code is mostly mov
08:41 timotimo and it's not all useless movs, too
08:50 brrt well, most of it can be erased
08:51 brrt there are a bunch of getlex to the same point
08:51 brrt these can safely be optimized away
08:51 brrt well, for some value of 'safely
08:51 timotimo heh.
09:27 brrt argh, never change the conditions while debugging
09:30 brrt (git reflog is a real lifesaver)
09:31 timotimo oh yes
09:44 Geth MoarVM/even-moar-jit: 6402e4e2b7 | (Bart Wiegmans)++ | src/jit/x64/tiles.dasc
09:44 Geth MoarVM/even-moar-jit: Fix output register for indirect indexed tiles
09:44 Geth MoarVM/even-moar-jit:
09:44 Geth MoarVM/even-moar-jit: Another case of three-operand/two-operand conversion pain.  I feel like
09:44 Geth MoarVM/even-moar-jit: the solution should be generalized, but it's not entirely obvious
09:44 Geth MoarVM/even-moar-jit: how.
09:44 Geth MoarVM/even-moar-jit: review: https://github.com/MoarVM/MoarVM/commit/6402e4e2b7
09:47 brrt yay, solved
09:47 brrt each bug fixed brings us closer :-)
10:02 timotimo neato
10:35 brrt now, i'm kind of wondering what to attack next
10:36 brrt best choice is probably C function call support
10:36 timotimo ah, yes
10:36 brrt because that's the blocking thing, mostly
10:36 timotimo that'll give us a whole bunch of ops in one fell swoop
10:36 timotimo (since we still have that script that parses graph.c)
10:36 brrt indeed
10:52 timotimo i don't really have an overview of what exactly works now
10:52 timotimo the exprjit can now fully compile if/then/else?
10:55 brrt yeah
10:56 samcv nqp::index goes to MVM_string_index right?
10:56 timotimo it should
10:56 timotimo you can check in interp.c
10:57 samcv hmm for some reason i wasn't getting any 2+ needle length calls to that, in cases where both strings were made up of 32 bit graphes
10:57 samcv ah yeah that is what my issue was
10:59 timotimo what exactly?
11:00 samcv just some ascii text
11:00 timotimo oh btw, you can use the xx operator (or is it x?) to build a long string (as a rope) and then turn the rope into a flat array of graphemes by using nqp::indexingoptimized
11:00 timotimo oh, it optimized the text into 8bit graphemes?
11:00 samcv yeah
11:00 samcv i suppose
11:00 samcv ya
11:01 timotimo we used to have indexingoptimized as "flattenropes", which would modify the string in-place. that was a very fragile thing that lead to many explosions :D
11:01 nine .tell brrt I'd attack whatever is keeping others from contributing to the exprejit
11:01 yoleaux2 nine: I'll pass your message to brrt.
11:08 brrt joined #moarvm
11:09 brrt nine, yeah, that's mostly function call support :-)
11:09 yoleaux2 11:01Z <nine> brrt: I'd attack whatever is keeping others from contributing to the exprejit
11:10 brrt so what works now: basic register allocation, spill/loads, marking of register preferences (but not conflict resolution)
11:12 brrt the basic challenge of function calls is that they a): ensure that values in volatile registers are properly spilled on a function call; b): ensuring that values are in their proper places for ABI requirements
11:12 brrt i have a strategy in place to tackle that
11:22 nine :)
12:10 domidumont joined #moarvm
12:27 domidumont joined #moarvm
13:09 ggoebel joined #moarvm
13:12 domidumont joined #moarvm
13:22 zakharyas joined #moarvm
15:10 brrt joined #moarvm
15:11 brrt so, one of the minor issues of complexity
15:12 brrt we can have more arglist arguments than we can have registers
15:13 brrt so; if we tried to load all these arguments in registers (as we would if they were 'normal' references), then we'll have infinite problems
15:13 brrt as a result, we shouldn't really do that
15:15 brrt instead…. we can load it directly from the spilled live range (if any)
15:20 zakharyas joined #moarvm
15:33 brrt hmm, here's an idea
15:33 brrt nope, scrap the idea
15:33 brrt (the idea was: magic-case the handling of ARGLIST in value references. not a good idae)
15:49 brrt but ARGLIST things in general need to be special cased....
15:49 brrt okay, we'll figure it out
16:29 TimToady joined #moarvm
17:27 FROGGS joined #moarvm
17:36 zakharyas joined #moarvm
18:57 ggoebel joined #moarvm
18:59 pyrimidine joined #moarvm
19:33 Geth joined #moarvm
20:05 pyrimidine joined #moarvm
20:30 pyrimidine joined #moarvm
22:02 pyrimidine joined #moarvm
22:45 samcv here is a gist of what I get backtracing moar when running prove on the nqp regex test 01 https://gist.github.com/875b9bbfe6689b0bffedb8a1bb2f74a3
22:57 jnthn That's just invoking a frame
22:57 jnthn It suggests whatever the hang is, it's in high-level code in a loop, not in C code in a loop
22:57 jnthn MVM_dump_backtrace(tc) may help
22:59 jnthn Note that since you attached, stdout will be pointing to the original place it was going
22:59 jnthn So, maybe test harness
22:59 jnthn Can always open a file in the debugger and dup2 or so
22:59 jnthn If -v won't show the output
23:03 timotimo hopefully running the test file directly with nqp-m will also give you the problem
23:04 jnthn Aye, that makes things easier if so :)
23:09 pyrimidine joined #moarvm
23:37 pyrimidine joined #moarvm

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary