Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2017-01-10


All times shown according to UTC.

Time Nick Message
00:33 samcv jnthn, what would be the best way to store multiple codepoints for emoji sequences?
00:33 samcv for decomposition we just store a string and then parse it. but there has to be a better way to store multiple things?
00:37 samcv bye bye dalek
00:37 dalek joined #moarvm
00:41 pyrimidine joined #moarvm
02:04 lizmat_ joined #moarvm
02:04 pyrimidine joined #moarvm
02:30 pyrimidine joined #moarvm
02:48 ilbot3 joined #moarvm
02:48 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
03:29 pyrimidine joined #moarvm
04:01 geekosaur joined #moarvm
04:25 geekosaur joined #moarvm
04:40 pyrimidine joined #moarvm
05:14 pyrimidine joined #moarvm
05:36 pyrimidine joined #moarvm
06:59 samcv jnthn, so i got it fully working \o/
07:05 domidumont joined #moarvm
07:10 domidumont joined #moarvm
07:11 nwc10 good *, #moarvm
07:12 samcv hi nwc10 :)
07:18 samcv https://github.com/MoarVM/MoarVM/pull/492 \o/
07:28 Ven joined #moarvm
07:37 brrt joined #moarvm
07:41 brrt good * #moarvm
07:42 * samcv waves
07:45 brrt i'm checking your PR
07:45 brrt 14 files changed, oh my
07:45 samcv need to fix a few things but
07:45 samcv most of those are just adding the op
07:45 brrt oh, wait, indeed
07:45 * brrt would prefer that the oplist were generated at build time....
07:45 samcv what's the best way to construct a string from codepoints btw? i don't think i did it the best way
07:46 brrt honestly, i don't know
07:46 brrt let's check first before i give you an answer to that
07:46 samcv trying to debug a problem when MVM errors for some of them
07:46 samcv i must not be constructing the grapheme properly
07:47 samcv but seems to happen when there's more than 2 codepoints in the sequence
07:47 brrt just for my info, why place getstrbyname before indexat in src/core/interp.c
07:47 samcv dunno
07:47 brrt i'd be surprised if that had any ill effect on the compiled code, but i had expected them to be in bytecode order
07:47 brrt hmmmm
07:47 samcv interp.c doesn't have to be in order
07:47 brrt no
07:48 brrt but i would still expect it to be
07:48 samcv hmm it seems to return the grapheme fine but.
07:48 samcv it errors later on
07:48 samcv MoarVM panic: MVM_nfg_get_synthetic_info called with out-of-range synthetic
07:48 samcv so it's not erroring in the code i made
07:48 samcv well maybe it is. maybe gdb is being weird
07:49 samcv may have not break'd at the right spot
07:49 samcv yeah it seems to panic after adding the 2nd codepoint to the buffer
07:49 brrt i assume you know why the enums have changed, so i'm not going to comment on that, either
07:49 samcv well i think it's erroring when adding the 3rd
07:49 samcv what about the enums?
07:49 brrt have you compiled with --debug
07:49 samcv yep
07:50 brrt hmmm
07:50 brrt have you compiled with --optimize=0
07:50 samcv i need to go into unicode.c which is generated from a compilation of multiple
07:50 samcv plus there's macros
07:51 samcv oh it looks like it's not generating the right number of array items
07:52 samcv ah i see. because my original testing code didn't use * things
07:53 brrt hmmpf
07:53 brrt i honestly have no comments on that PR
07:53 brrt :-)
07:53 brrt well, some things
07:56 samcv uhm how do i get the size of this structure properly: static const MVMint32 uni_seq_16[] = {0x1F487,0x1F3FF}
07:57 samcv from const MVMint32 * uni_seq =  uni_seq_enum[result->structitem];
07:57 samcv maybe there is not a way. or maybe there is. i could always store the number of codepoints as the first item in the structure if i cannot
07:58 samcv uni_seq_enum[] just stores pointers to the uni_seq_xx
07:58 brrt what do you mean by 'size of this structure'
07:59 brrt i see no reason why sizeof() wouldn't give you the right thing
07:59 brrt namely, 8
08:00 samcv ah yeah. i see what i was doing wrong
08:23 samcv brrt, well i think i got it
08:23 samcv well not the size part yet. but the other crashy
08:26 zakharyas joined #moarvm
08:27 samcv brrt, sizeof(uni_seq)/sizeof(MVMint32); gives me half the size i want
08:28 samcv that's what i thought would give me the number of items in the array, but it returns a number half that
08:28 samcv why is this?
08:28 samcv because uni_seq is a 64 bit pointer?
08:29 lizmat that would be my first guess ?
08:30 samcv ok yeah. i don't want to divide by anything just want to dereference it
08:31 samcv getting the right number now
08:31 brrt well, you just defined uni_seq_16, not uni_seq
08:35 samcv brrt, i want to get the number of elements in the array. still not working argh
08:35 samcv sizeof(*uni_seq) gives me size of the MVMint32 type because uni_seq is MVMint32 *
08:37 arnsholt If uni_seq is declared as a pointer, there's no way to figure out the length of the array
08:37 samcv ok that's what i thought originally
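
[Editor's note] The sizeof confusion above can be reproduced in a few lines. This is a minimal sketch, not MoarVM code: seq_demo is a hypothetical stand-in for one of the generated uni_seq_xx tables, and int32_t stands in for MVMint32. It shows that the sizeof division gives the element count only while the array type is still visible. Once the table is reached through a pointer, the same division yields the pointer size divided by the element size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one generated uni_seq_xx table. */
static const int32_t seq_demo[] = { 0x1F487, 0x1F3FF };

/* Valid only where the array itself (not a pointer to it) is in scope. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Element count taken from the array type: 8 bytes / 4 bytes. */
size_t count_from_array(void) {
    return ARRAY_LEN(seq_demo);
}

/* The same arithmetic after the array has decayed to a pointer.
 * The result depends on the platform's pointer size, not on the array. */
size_t count_from_pointer(void) {
    const int32_t *p = seq_demo;
    size_t pointer_bytes = sizeof(p);   /* size of the pointer itself */
    size_t element_bytes = sizeof(*p);  /* size of one int32_t */
    assert(element_bytes == 4);
    return pointer_bytes / element_bytes;
}
```

On a typical 64-bit build count_from_pointer() returns 2 for any table, however many elements it has, which is why the length has to be stored somewhere else.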
08:38 samcv well actually i declare static const MVMint32 uni_seq_449[] = {0x1F3C4,0x1F3FB,0x200D,0x2640,0xFE0F}
08:38 samcv and then i have a struct which contains uni_seq_449 uni_seq_450 etc
08:38 samcv so i access the uni_seq_xx from the struct
08:38 brrt uhuh
08:38 brrt hmm
08:39 brrt i see
08:39 brrt no, you can't do that
08:39 samcv uni_seq = uni_seq_enum[blah];
08:39 samcv ok i will just have the 1st item be the length
08:39 * brrt was just about to suggest that
08:39 samcv yea
08:39 brrt alternatively, have a sentinel value at the end
08:39 samcv yeah
08:39 arnsholt Yeah, those are the standard solutions
08:39 arnsholt Pascal arrays (length first), or NULL-terminated
08:40 samcv i'm gonna go with length first
08:40 arnsholt Yeah, I like length first too, TBH
08:40 brrt as long as you don't forget to bias your indexes
08:42 samcv yea
08:49 arnsholt brrt: There's always "real_array = &data[1]" =)
08:50 brrt true
08:50 brrt although i've started to prefer: "real_array = data+1"
08:51 arnsholt Yeah, that'll work too
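
[Editor's note] The two standard layouts mentioned above (length first, or a sentinel at the end) can be sketched as follows. The names are illustrative and not taken from the PR, and the sentinel variant assumes that 0 never occurs as a payload value. len_first_items uses the "data + 1" form to get a pointer to the real payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length-first layout: slot 0 holds the number of codepoints that follow. */
const int32_t seq_len_first[] = { 5, 0x1F3C4, 0x1F3FB, 0x200D, 0x2640, 0xFE0F };

/* Sentinel layout: a trailing 0 marks the end (assumes 0 is never payload). */
const int32_t seq_sentinel[] = { 0x1F3C4, 0x1F3FB, 0x200D, 0x2640, 0xFE0F, 0 };

/* Number of payload items in a length-first array. */
size_t len_first_count(const int32_t *data) {
    assert(data != NULL);
    return (size_t)data[0];
}

/* Pointer to the first payload item; indexes into it start at 0 again. */
const int32_t *len_first_items(const int32_t *data) {
    assert(data != NULL);
    return data + 1;
}

/* Number of payload items in a sentinel-terminated array (linear scan). */
size_t sentinel_count(const int32_t *data) {
    size_t n = 0;
    assert(data != NULL);
    while (data[n] != 0)
        n++;
    return n;
}
```

Length first gives the count in constant time and allows any payload value. The sentinel form needs a scan and a reserved value, but saves nothing in space here since both layouts use one extra slot.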
08:52 brrt register arithmetic is surprisingly elegant if you get the hang of it
08:52 domidumont joined #moarvm
08:52 * brrt imagines a thousand rustaceans fainting reading that
08:53 arnsholt Register arithmetic?
08:53 brrt eh, pointer arithmetic
08:53 brrt hehe
08:53 arnsholt Oh, right =)
08:54 arnsholt Yeah, it's not too bad once you get used to it
08:54 brrt yeah, my bad, i'm working on a blog post
08:54 arnsholt But I do think a language like Rust has the potential to kill off entire classes of problems
08:54 brrt well, it goes hand in hand with certain patterns (of memory management / layout), and if you're not into those patterns, then it's going to suck
08:54 brrt hmmm. no doubt
08:55 brrt on the other hand
08:55 arnsholt And a problem with pointer arithmetic is that if you fuck it up, all kinds of weird shit can happen
08:55 brrt i've had to fix many, many errors in the register allocator before it worked
08:55 brrt i think just 2 of these were actual honest memory corruption / overflow errors
08:56 brrt and they were swiftly caught by ASAN
08:56 moritz dishonest memory corruption errors are the worst :-)
08:56 brrt one of these was actually a data-structure-and-algorithm-choice error, at the root of it
08:56 brrt the other was a noninitialized value
08:56 brrt all other issues were logic issues
08:56 brrt so.....
08:57 brrt it's undoubtedly true that rust relieves programmers of whole classes of errors
08:57 samcv ok it actually really works now \o/
08:57 brrt what is not so self-evident is that those classes of errors are the most frequent or most important errors
08:57 brrt samcv++
08:57 brrt (although I guess you could point to a number of CVE's which prove me wrong)
08:58 brrt on the other, other hand
08:58 brrt remember shell-shock
08:58 brrt nothing buffer overrunny about that
08:58 brrt was a logic error
08:58 arnsholt Yeah, Rust won't save you from those
08:58 brrt rust won't save you from phishing, either
08:58 arnsholt Nope
08:59 brrt so i'm *a bit* annoyed about the hype surrounding 'rust = safety'
08:59 brrt that doesn't mean i don't want to try it out sometime :-)
08:59 arnsholt I think it's not too far off the mark
08:59 brrt it's a correct statement. it is the hype which is unreasonable
08:59 arnsholt Especially when you get things like memory shenanigans in file(1) and friends
08:59 arnsholt Yeah, hype is hype, I guess
09:00 brrt (this too shall pass :-))
09:03 moritz it's really that Rust offers compile-time abstractions without (much) of a runtime cost
09:03 moritz without the crazy subtle semantics that C++ has, too :-)
09:03 brrt that's pretty cool, yes
09:09 pyrimidine joined #moarvm
09:24 brrt joined #moarvm
09:28 samcv now just time to make spectest :)
09:32 brrt make spectest, not segv
09:32 samcv heh
09:37 jnthn joined #moarvm
09:37 Util joined #moarvm
09:37 mst joined #moarvm
09:37 nine_ joined #moarvm
09:39 nwc10_ joined #moarvm
09:39 moritz_ joined #moarvm
09:39 ggoebel joined #moarvm
09:40 camelia joined #moarvm
09:44 japhb joined #moarvm
09:53 jnthn moarning o/
09:53 samcv morning jnthn
10:02 brrt moarning jnthn
10:03 * jnthn catches up with backlog here and on #perl6-dev to see what happened during the night :)
10:03 jnthn samcv: So, any leftover questions, or is it now at the point of "review my PR"? :)
10:03 samcv yeah. review my PR :-D
10:03 samcv it works fully and is gud
10:04 samcv let me know if there's something i did you don't like though
10:05 samcv spectest just finished and pass
10:05 jnthn Alrighty
10:05 samcv oh there's one thing MVM_string_from_grapheme i just copied it into that file
10:06 samcv other than that
10:06 jnthn Working example: nqp-m -e "say(nqp::getstrbyname(''person golfing: medium-light skin tone'))"
10:06 samcv err maybe it was already there
10:06 jnthn I...uh...doubt this works, due to the extra quote at the start? :)
10:06 samcv err wait where is it from
10:06 samcv lies!
10:07 jnthn From the PR description ;)
10:07 samcv heh yeah whatever the double quotes
10:08 samcv oh i know where i stole it from
10:08 samcv it's MVM_string_chr except without checking to make sure there are no negative graphemes :)
10:09 samcv maybe should have MVM_string_chr call that one? anyway check out the PR and let me know
10:09 samcv (so we don't duplicate code)
10:09 samcv and move it to ops.c or something
10:13 pyrimidine joined #moarvm
10:26 jnthn Yeah, currently reviewing
10:38 jnthn OK, review done
10:39 samcv This is called string_from_grapheme, but actually is taking a codepoint, which is not always a grapheme.
10:39 samcv but uhm. it takes both?
10:39 samcv synthetic and non synthetic's
10:39 samcv idk what it should be called then
10:39 jnthn That's not what grapheme means.
10:39 jnthn Grapheme means "in NFG form"
10:40 jnthn The positive integers of the NFG representation all just happen to align with NFC codepoints.
10:40 samcv so what are the negative ones?
10:40 jnthn Also graphemes
10:40 samcv those are graphemes yes?
10:40 jnthn Yes
10:41 jnthn We use "synthetics" to talk about the negatives.
10:41 jnthn But I think the routine being called string_from_grapheme is fine
10:41 samcv ok
10:41 jnthn It should just take MVMGrapheme32 and it doesn't need to run it through the normalizer at all
10:41 jnthn Because it's already NFG
10:42 jnthn Note that while having such a function in MoarVM is fine, we shouldn't expose that one directly to the outside world
10:42 samcv yeah
10:42 jnthn (We never expose synthetics, because we don't want people to rely on their integer values.)
10:42 samcv yep
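
[Editor's note] A toy model of the grapheme terminology above: non-negative values are plain codepoints, negative values ("synthetics") refer to an entry in a table of codepoint sequences. Everything in this sketch is an assumption made for illustration: the Grapheme32 typedef, the SynthInfo struct, the table contents, and in particular the mapping from a negative value to a table index. It is not MoarVM's actual data structure or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t Grapheme32;  /* hypothetical stand-in for MVMGrapheme32 */

/* Hypothetical record describing one synthetic. */
typedef struct {
    const int32_t *codes;      /* codepoints that make up the grapheme */
    size_t         num_codes;
} SynthInfo;

static const int32_t synth0_codes[] = { 0x1F487, 0x1F3FF };
static const SynthInfo synth_table[] = { { synth0_codes, 2 } };
#define NUM_SYNTHS (sizeof(synth_table) / sizeof(synth_table[0]))

/* In this model, synthetics are exactly the negative graphemes. */
int is_synthetic(Grapheme32 g) {
    return g < 0;
}

/* Number of codepoints behind a grapheme: 1 for a non-negative value,
 * a table lookup for a synthetic, 0 if the synthetic is out of range.
 * Assumed mapping for this toy model: -1 -> index 0, -2 -> index 1, ... */
size_t grapheme_num_codes(Grapheme32 g) {
    size_t idx;
    if (g >= 0)
        return 1;
    idx = (size_t)(-(int64_t)g - 1);
    if (idx >= NUM_SYNTHS)
        return 0;
    assert(synth_table[idx].codes != NULL);
    return synth_table[idx].num_codes;
}
```

Because the integer values of synthetics only mean something relative to the table, exposing them outside the VM would let callers depend on values that can differ between runs, which matches the reason given above for keeping them internal.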
10:42 samcv uhm so how do i do it without MVM_unicode_normalizer_process_codepoint
10:43 samcv i tried without it but i kept running into issues
10:43 jnthn Which "it"? :)
10:44 jnthn How to implement string_from_grapheme?
10:44 samcv uh adding to buffer.
10:44 samcv also yes that
10:44 samcv err. no.
10:45 samcv but also i'm using MVM_unicode_normalizer_process_codepoint just because i don't want any issues if we run into the cases where we don't correctly break in emoji sequences
10:45 samcv there are still a few that don't work properly
10:45 jnthn I'd just change the signature to take `MVMGrapheme32 g` and then get rid of the use of the normalizer
10:45 jnthn And then it's already correct
10:46 samcv ok how do i do it without the normalizer?
10:46 jnthn Because s->body.storage.blob_32[0] = g; does the right thing
10:46 jnthn (What I just said is about inside of string_from_grapheme)
10:47 jnthn It's totally reasonable to use the normalizer in MVM_unicode_string_from_name
10:47 samcv oh just don't use it in MVM_string_from_grapheme
10:47 jnthn Right
10:47 jnthn Because by the time you call that you already have a grapheme :)
10:47 samcv and this will also work if i have multiple graphemes right?
10:48 samcv well er probably not
10:48 jnthn Well, not at the moment, because the signature is MVMGrapheme32
10:48 samcv but can cross that road when we come to it
10:48 jnthn Yeah
10:48 jnthn Though fixing it now isn't so hard
10:49 jnthn Lemme find a good example
10:49 samcv ok
10:49 jnthn https://github.com/MoarVM/MoarVM/blob/master/src/strings/normalize.c#L88
10:49 jnthn This function actually already nearly does what you ned
10:49 jnthn *need
10:50 jnthn It's just that it takes an MVMObject * as its input and pulls data out of that
10:50 samcv yes i saw that
10:50 jnthn But we could split it into two parts
10:50 jnthn One that works on a C-level array
10:50 samcv that would be cool
10:50 jnthn And takes a length
10:50 jnthn And then you can just feed the codepoint array you've got into it
10:50 samcv yeah i had seen that function but it didn't do exactly what i wanted
10:51 samcv well my array's 1st item is the number of items in it, but i can always move the pointer by 1
10:51 samcv and already have the length
10:51 jnthn Sure, just move the pointer by 1 and pass in that and the length
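
[Editor's note] The split proposed above (one part that works on a C-level array plus a length, one thin part that unpacks an object) can be sketched like this. All names are hypothetical, CodeArrayObject is a stand-in for whatever object type holds the codepoints, and the core routine merely copies codepoints as a placeholder for the real normalization work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object that owns a codepoint array. */
typedef struct {
    const int32_t *elems;
    size_t         count;
} CodeArrayObject;

/* Core routine: knows nothing about objects, only a C array and a length.
 * Placeholder work: copy up to cap codepoints into out, return the count. */
size_t process_codes(const int32_t *codes, size_t n, int32_t *out, size_t cap) {
    size_t copied = 0;
    assert(codes != NULL && out != NULL);
    while (copied < n && copied < cap) {
        out[copied] = codes[copied];
        copied++;
    }
    return copied;
}

/* Thin wrapper: pull array and length out of the object, then delegate. */
size_t process_object(const CodeArrayObject *obj, int32_t *out, size_t cap) {
    assert(obj != NULL);
    return process_codes(obj->elems, obj->count, out, cap);
}

/* Caller holding a length-first table: skip slot 0 and pass its value
 * as the length, as suggested in the conversation. */
size_t process_len_first(const int32_t *data, int32_t *out, size_t cap) {
    assert(data != NULL);
    return process_codes(data + 1, (size_t)data[0], out, cap);
}
```

With this shape, both the object-based entry point and the table-based lookup share one implementation, so a separate single-grapheme helper is not needed.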
10:51 jnthn Though I was a tad confused about the length
10:52 samcv hm?
10:52 jnthn Whether it includes the element specifying the length or not
10:52 samcv no it does not
10:52 samcv it's the number of codepoints
10:52 jnthn for (int i = 1; i < array_size; i++) {
10:53 jnthn So isn't this an off-by-one, or do I need another coffee? :)
10:53 jnthn (If we start at 1 to skip the length, then it'd need to be <= ? )
10:53 samcv nope
10:54 samcv almost certain not
10:54 samcv i thought a similar thing at first but, that is off by one if i do <=
10:54 samcv but i will 2x check
10:56 jnthn m: my @a = 2, 100, 101; my $array_size = @a[0]; loop (my $i = 1; $i < $array_size; $i++) { say @a[$i] }
10:56 camelia rakudo-moar ed5c86: OUTPUT«100␤»
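
[Editor's note] The same off-by-one check in C, as a minimal sketch with illustrative names. The array mirrors the one in the evaluation above: slot 0 holds the payload count (2), followed by two payload values. Starting at index 1 with an exclusive bound visits one item too few, while the inclusive bound visits all of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length-first array: slot 0 is the payload count, not the total size. */
const int32_t demo[] = { 2, 100, 101 };

/* Loop as written in the PR under review: i < array_size. */
size_t visited_exclusive(const int32_t *data) {
    size_t visited = 0;
    assert(data != NULL);
    for (int32_t i = 1; i < data[0]; i++)
        visited++;
    return visited;
}

/* Corrected bound: i <= array_size, because indexing starts at 1. */
size_t visited_inclusive(const int32_t *data) {
    size_t visited = 0;
    assert(data != NULL);
    for (int32_t i = 1; i <= data[0]; i++)
        visited++;
    return visited;
}
```

An alternative that avoids the biased index entirely is to advance the pointer by one first and then loop from 0 with the usual exclusive bound.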
10:59 dalek MoarVM: 8bfbb0e | jnthn++ | src/gc/orchestrate.c:
10:59 dalek MoarVM: Tweak full collection criteria in heap profiling.
10:59 dalek MoarVM:
10:59 dalek MoarVM: The recording of heap snapshots will of course use memory, which will
10:59 dalek MoarVM: throw off the RSS heuristic and make us a *lot* less likely to ever do
10:59 dalek MoarVM: a full collection, distorting the profiles. This is also a bit of a
10:59 dalek MoarVM: distortion (to more regular heap profiles being taken), but it's an
10:59 dalek MoarVM: improvement. (To do better, we could try tracking RSS before/after
10:59 dalek MoarVM: snapshots and excluding that memory from the calculation. Patches
10:59 dalek MoarVM: welcome if anyone tries it and finds that a viable appraoch.)
10:59 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/8bfbb0ef47
10:59 dalek MoarVM: 68b5e35 | jnthn++ | src/profiler/heapsnapshot.c:
10:59 dalek MoarVM: Null-check the *correct* thread's ->cur_frame.
10:59 dalek MoarVM:
10:59 samcv ok it is off by one now jnthn i must have changed something else
10:59 dalek MoarVM: 539346d | jnthn++ | src/io/ (3 files):
10:59 dalek MoarVM: Take into account actual allocated of I/O buffers.
10:59 dalek MoarVM:
10:59 dalek MoarVM: It seems libuv suggest we allocate 64KB sometimes, even when the
10:59 dalek MoarVM: input we get is tiny. While I'm not sure second-guessing it is wise,
10:59 dalek MoarVM: we should at least be honest internally about what's allocated. By
10:59 dalek MoarVM: storing the actual allocated size, the GC can track it as part of
10:59 dalek MoarVM: the gen2 promotion statistics, and be smarter about triggering full
10:59 dalek MoarVM: collections. This reduces memory overhead.
10:59 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/539346da66
10:59 dalek MoarVM: 80c8044 | jnthn++ | src/io/ (3 files):
10:59 dalek MoarVM: Merge pull request #488 from MoarVM/more-pressure
10:59 dalek MoarVM:
10:59 dalek MoarVM: Take into account actual allocated of I/O buffers.
10:59 jnthn samcv: Phew, I didn't need stronger coffee after all :-)
11:00 zakharyas joined #moarvm
11:00 samcv you still need stronger coffee though
11:00 samcv just because why not
11:01 jnthn The stuff I'm drinking now is quite a bit weaker than my regular...
11:01 jnthn I was given a box set of coffees at Christmas.
11:02 jnthn I'm used to drinking a 5. If I found the 3 a bit weak, I dunno what I'll make of the 1s. :P
11:02 jnthn Hm, let's switch to using Geth instead of dalek here...seems to be working fine for other projects
11:04 Geth MoarVM: 4d87b1cc70 | (Jonathan Worthington)++ | src/spesh/candidate.c
11:04 Geth MoarVM: Free up spesh log slots after specialization.
11:04 Geth MoarVM:
11:04 Geth MoarVM: Spesh logging keeps values alive, preventing the GC from collecting
11:04 Geth MoarVM: them. It logs values to sample what types show up, which is fine, but
11:04 Geth MoarVM: we should not hang on to them beyond the point the specializer has
11:04 Geth MoarVM: used them in its analysis. This reduces memory overhead, perhaps
11:04 Geth MoarVM: quite notably in some applications that have large objects (for
11:04 Geth MoarVM: example, RT #130494 leaked many objects in this way). On CORE.setting
11:04 Geth MoarVM: compilation it saves ~3MB - not much in the scheme of things, but nice
11:04 Geth MoarVM: to win.
11:04 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/4d87b1cc70
11:04 Geth MoarVM: c670eadf6b | (Jonathan Worthington)++ | src/spesh/candidate.c
11:04 Geth MoarVM: Merge pull request #490 from MoarVM/free-spesh-log-slots
11:04 Geth MoarVM:
11:04 Geth MoarVM: Free up spesh log slots after specialization.
11:04 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/c670eadf6b
11:05 jnthn Nice...now our commits are reported by a bot running on MoarVM :)
11:05 samcv jnthn, That's reasonable, but we should at the very least stick in an assert that we really get 0 back from this.
11:05 samcv what do we want to do in case it's not 0?
11:05 samcv return empty string?
11:06 jnthn Well, if the plan is that we'll re-use the code inside of MVM_unicode_codepoints_to_nfg_string it'll be fine
11:06 jnthn Since it handles cases where the sequence produces multiple graphemes.
11:07 jnthn If it's non-zero it'd mean we were about to silently lose a grapheme.
11:08 samcv yeah
11:08 jnthn But really, I'd break MVM_unicode_codepoints_to_nfg_string into two pieces
11:08 jnthn Everything below input_codes = ((MVMArray *)codes)->body.elems; can be factored out
11:08 jnthn And then called as with input and input_codes
11:09 samcv can this be done later?
11:12 jnthn I guess, but it'd avoid the need to introduce MVM_string_from_grapheme and resolve all the issues I had in MVM_unicode_string_from_name except the off-by-one :)
11:13 jnthn And result in less code overall
11:14 samcv i will look into it tomorrow most likely since it's 3am here now
11:15 samcv we won't need MVM_string_from_grapheme then anymore right?
11:16 samcv also aside from splitting unicode_codepoints_to_nfg_string, i think i've made all the changes you requested now
11:19 jnthn Right
11:19 jnthn OK, sounds good.
11:19 jnthn Rest well :)
11:21 timotimo o/
11:21 samcv not asleep yet :P
11:21 samcv but i'm mostlyish done coding for the day
11:24 timotimo i do wonder what causes all our bots to consume more and more memory
11:24 timotimo i'd need to run them myself to figure that out
11:25 jnthn Well, lemme merge work-lifetime first :)
11:25 jnthn My first attempt to rebase stuff to clean up resulted in SEGV...
11:26 arnsholt Whee! =)
11:27 samcv hehehe
11:27 timotimo wow,oops
11:28 jnthn I probably did something silly :)
11:28 jnthn Works on second attempt
11:29 jnthn All I intended to do was strip out a commit that shouldn't have been in and part of another.
11:29 timotimo i think i already asked for it a long time ago ... someone could implement abs_i for our jit and it'd positively impact something inside commitable
11:29 timotimo i mean, i already mentioned abs_i could be done
11:29 timotimo but i don't know how to do that properly
11:32 jnthn Aww, where went Geth?
11:32 arnsholt Ping timeout, apparently
11:32 jnthn Anyway, just pushed the rebase of work-lifetime fixing the thing timotimo++ mentioned ;)
11:33 timotimo wait, i mentioned what? ;)
11:33 timotimo oh the typo?
11:33 timotimo i mean ... switcho? switcheroo?
11:33 samcv work-lifetime sounds sort of ominous. as if that concludes all work jnthn will do on mvm lol
11:33 jnthn Yeah, that.
11:33 jnthn :D
11:33 samcv work-lifetime pushed. nothing more to do!
11:36 jnthn Yup. All done. Now I can go to the Alps and spend my days sipping beer and enjoying the view. :)
11:37 jnthn Well, NQP and Rakudo builds seem happy post-rebase
11:37 timotimo MoarViem
11:38 jnthn At first I was like "huh, got a few seconds slower again??", then realized I've got IntelliJ running outside of the VM which is probably hogging an amount of resources...
11:38 pyrimidine joined #moarvm
11:38 * brrt feels for jnthn's computer
11:40 jnthn It leads a busy life :)
11:41 Geth joined #moarvm
11:41 notviki aww
11:42 notviki Ping timeout... unsure why
11:44 samcv jnthn, how to name MVM_unicode_codepoints_to_nfg_string that takes in a unicode string
11:44 samcv err that takes a c array
11:44 samcv can i just uhm. make a new one and change the op mapping
11:44 timotimo just put a _v at the end, just like OpenGL uses :P
11:45 samcv v?
11:45 timotimo "vector"
11:45 samcv no i get that but
11:45 samcv but why vector
11:45 timotimo another word for contiguous array
11:45 samcv it's two dimensional i guess… but
11:45 timotimo wow. i was hoping to find an example by searching for "glgetv", but it seems like there's shoes that are called that
11:47 jnthn samcv: I'd leave the original one as is and call the factored out bit MVM_unicode_codepoints_c_array_to_nfg_string or so :)
11:48 jnthn Seems work-lifetime is good for merge :)
11:48 brrt \o/
11:49 * lizmat is looking forward :-)
11:51 * brrt apparently can't write short blog posts….
11:52 timotimo it is difficult
11:52 samcv i will have to make a blog post once this is all done on all this unicode things
11:52 brrt i'd be interested in that. are you syndicated on pl6anet.org?
11:53 timotimo ... "The Syndicate" title theme song plays in the distance ...
11:53 samcv nope brrt how do i get that
11:54 timotimo i think stmuk can add your .rss to the list
11:54 brrt you should ask moritz, I think
11:55 * moritz can't do anything on pl6anet
11:55 moritz yes, stmuk is the one to talk to
11:56 brrt (pointer following :-))
11:57 notviki samcv: just add yours to this file: https://github.com/stmuk/pl6anet.org/blob/master/perlanetrc
11:57 samcv sweet
11:58 moritz ooh, nice
11:58 moritz maybe add a link to the github repo to the website, while you're at it? :-)
12:03 Geth MoarVM: samcv++ created pull request #493: Refactor MVM_unicode_codepoints_to_nfg_string
12:03 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/pull/493
12:03 samcv woah. fancy
12:03 samcv anyway jnthn here you go
12:04 samcv spectest almost done completing, so should be ready to merge if you have no problems with it
12:07 samcv ok spectest pass. that one is ready for Merge
12:07 timotimo nice
12:11 jnthn Travis is having a go slow...
12:21 * jnthn spectests a fix for https://github.com/MoarVM/MoarVM/issues/482
12:21 jnthn lunch, bbi30
12:24 samcv jnthn, fixed now. also i've rewritten the new get_string_from_name or whatever it's called to use the new function
12:26 samcv will rebase the string from name one once the newest PR is accepted
12:42 pyrimidine joined #moarvm
12:50 * lizmat needs some help with a codegen issue in Actions
12:50 samcv night all o/
12:50 Geth MoarVM/master: 17 commits pushed by jnthn++
12:50 Geth MoarVM/master: review: https://github.com/MoarVM/MoarVM/compare/ee9c9f962b…f712c6a777
12:51 lizmat good night, samcv
12:51 jnthn samcv: Newest PR looks good, I will merge it once Travis checks OK
12:51 jnthn Thanks; 'night o/
12:52 lizmat basically, I need to get Zop to call infix:<Z>(...,:with(op))
12:53 lizmat instead of somehow working something with METAOP_ZIP
12:53 lizmat line 6893 in Actions
12:53 lizmat oops, 6983
12:54 lizmat feels like I'm trying to work this at the wrong place
12:54 lizmat oddly enough, bare Z does codegen to a direct call to &infix:<Z>
12:55 lizmat would appreciate any help there  :-)
13:00 Geth MoarVM/utf8-c8-boundary-fix: 9475d8db4c | (Jonathan Worthington)++ | src/strings/utf8_c8.c
13:00 Geth MoarVM/utf8-c8-boundary-fix: Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8
13:00 Geth MoarVM/utf8-c8-boundary-fix:
13:00 Geth MoarVM/utf8-c8-boundary-fix: In the last round of tweaks to UTF8-C8, we fixed some sequences that
13:00 Geth MoarVM/utf8-c8-boundary-fix: would not round-trip properly due to being mis-represented in UTF8.
13:00 Geth MoarVM/utf8-c8-boundary-fix: The fix dealt with those cases, but was a bit too sweeping. UTF8-C8
13:00 Geth MoarVM/utf8-c8-boundary-fix: aims to decode everything that's both valid UTF8 and in NFC as the
13:00 Geth MoarVM/utf8-c8-boundary-fix: UTF8 decoder would, and express everything else as synthetics that
13:00 Geth MoarVM/utf8-c8-boundary-fix: will ensure round-tripping. This fix deals with the issue raised in
13:00 Geth MoarVM/utf8-c8-boundary-fix: MoarVM Issue #482, while not regressing any of the UTF8-C8 roundtrip
13:00 Geth MoarVM/utf8-c8-boundary-fix: tests.
13:00 Geth MoarVM/utf8-c8-boundary-fix: review: https://github.com/MoarVM/MoarVM/commit/9475d8db4c
13:01 Geth MoarVM: jnthn++ created pull request #494: Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8
13:01 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/pull/494
13:01 jnthn arnsholt: I think https://github.com/MoarVM/MoarVM/pull/494 does what you were suggesting; please take a glance if you've a moment :)
13:01 jnthn lizmat: What does your diff to do it look like?
13:02 jnthn I'd expect it to be mostly setting .named('with')
13:02 lizmat well, yes and no:
13:02 lizmat I think the thinko I made is that METAOP_ZIP returns a block that takes a lol
13:02 jnthn Yes
13:02 lizmat whereas &infix:<Z>:with returns a Seq
13:02 jnthn Oh
13:02 jnthn Yes
13:03 jnthn So it won't work to do that simple rewrite
13:03 lizmat indeed
13:03 jnthn We need the extra level of thunk for nested meta-ops
13:03 lizmat why?  it apparently isn't needed for a bare Z?
13:04 lizmat or do you mean a ZZ ?
13:04 lizmat or a ZX ?
13:05 jnthn Yes, any of those
13:05 jnthn Or ZZ :)
13:05 lizmat ah, so maybe I should codegen a call to Rakudo::Internals.ZipIterator... directly
13:07 jnthn Will that return a block?
13:07 lizmat atm that returns a Seq
13:07 lizmat no, Iterator
13:08 arnsholt jnthn: Yeah, that looks right to me!
13:08 jnthn arnsholt: OK, thanks. :)
13:09 jnthn lizmat: I think whatever we code-gen meta-ops to, we'd need to have it be a block except at the top level
13:09 jnthn lizmat: We may be able to do smarter at the top level
13:09 jnthn (But would also need a mechanism to detect it)
13:10 lizmat ok, lemme digest that for a bit  :-)
13:10 jnthn OK
13:10 jnthn Going to switch to $other-job for a bit :)
13:11 lizmat thanks so far!
13:12 jnthn But will be about :)
13:12 jnthn Will merge stuff when Travis is happy
13:12 jnthn And bump MOAR/NQP revisions, so hopefully everyone can enjoy the fixes :)
13:14 timotimo for cases like my &bleh = &[Z,] and such
13:14 jnthn Oh, that also :)
13:47 Geth MoarVM: dd7d4d086d | (Samantha McVey)++ | 2 files
13:47 Geth MoarVM: Refactor MVM_unicode_codepoints_to_nfg_string
13:47 Geth MoarVM:
13:47 Geth MoarVM: Seperate out the section which involves MVMObject so we can re-use this
13:47 Geth MoarVM: function in other places with native c data structures.
13:47 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/dd7d4d086d
13:47 Geth MoarVM: 37bb9737bd | (Jonathan Worthington)++ | 2 files
13:47 Geth MoarVM: Merge pull request #493 from samcv/MVM_unicode_codepoints_to_nfg_string
13:47 Geth MoarVM:
13:47 Geth MoarVM: Refactor MVM_unicode_codepoints_to_nfg_string
13:47 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/37bb9737bd
13:49 brrt something extra to ponder
13:50 brrt how am i going to extend the linear scan allocator (and the expr jit in general) to work with SSE registers
13:50 brrt im not at all sure that the rex byte will work for those
13:51 brrt matter of fact
13:51 brrt i know nothing about SSE registers and their encoding
13:52 jnthn I figure this is something we can worry about once we've got stuff working at all
13:52 jnthn (as in, post-merge)
13:52 jnthn afaik we don't have any code that uses those today?
13:52 jnthn So we won't miss out on anything?
13:52 jnthn (anything we're already getting, that is)
13:57 brrt yes, definitely
13:57 brrt but i make plans long ahead
13:58 brrt :-)
13:58 brrt i've more or less figured out how to implement ARGLIST
13:59 brrt as I said, that's the last essential bit before we can really consider merging
13:59 brrt by the way, the current JIT *does* work with SSE registers
13:59 jnthn Oh? What for ooc?
14:00 brrt for floating point calculations :-)
14:00 brrt the alternative would be x87 coprocessor calculations. don't use those
14:00 jnthn heh, I didn't realize we weren't using those :P
14:01 brrt well, it's only for a few things
14:01 jnthn Ah, just some floating point ops?
14:01 jnthn So the basic things like + doesn't use them?
14:01 jnthn Compilation completed successfully with 3,719 warnings in 20m 25s 687ms (moments ago)
14:02 jnthn oops
14:02 jnthn ww
14:03 brrt :-)
14:04 brrt no, regular integer addition doesn't
14:04 jnthn But floating point addition?
14:04 brrt floating point addition does
14:04 * brrt looks for example
14:07 brrt https://github.com/MoarVM/MoarVM/blob/even-moar-jit/src/jit/x64/emit.dasc#L987
14:08 jnthn Hm
14:09 jnthn OK, so we probably do need to think about that at some point sooner rather than later if we want nice JIT of floating point code :)
14:09 brrt well, yeah
14:09 brrt i don't expect a terror, though
14:09 brrt i may need to extend dasm a bit again
14:10 brrt but the allocator shouldn't have to change (much)
14:11 brrt an extra stack for the additional registers, a few extra definitions, and some more care in accessors...
14:13 pyrimidine joined #moarvm
14:17 brrt oh, and passing floating point args, of course
14:22 pyrimidine joined #moarvm
14:25 lizmat so, why is there no METAOP_REDUCE_NON, and why doesn't the lack of that break Z.. ?
14:26 lizmat m: find-reducer-for-op(&[..])
14:26 camelia rakudo-moar ed5c86: OUTPUT«No such symbol '&METAOP_REDUCE_NON'␤  in block <unit> at <tmp> line 1␤␤Actually thrown at:␤  in block <unit> at <tmp> line 1␤␤»
14:29 moritz m: say 1 Z.. 2
14:29 camelia rakudo-moar ed5c86: OUTPUT«(1..2)␤»
14:29 moritz wow
14:30 lizmat looks to me these operators only can take 2 iterators
14:30 lizmat ever
14:30 lizmat m: say 1 Z.. 2 Z.. 3
14:30 camelia rakudo-moar ed5c86: OUTPUT«Range objects are not valid endpoints for Ranges␤  in block <unit> at <tmp> line 1␤␤»
14:31 lizmat m: dd 1 Zcmp 2 Zcmp 3
14:31 camelia rakudo-moar ed5c86: OUTPUT«(Order::Less,).Seq␤»
14:31 lizmat m: dd 1 Zcmp 2 Zcmp 1
14:31 camelia rakudo-moar ed5c86: OUTPUT«(Order::Less,).Seq␤»
14:31 lizmat m: dd 1 Zcmp -1 Zcmp 1
14:31 camelia rakudo-moar ed5c86: OUTPUT«(Order::Same,).Seq␤»
14:31 lizmat m: dd 1 Zcmp 1 Zcmp -1
14:31 camelia rakudo-moar ed5c86: OUTPUT«(Order::More,).Seq␤»
14:32 lizmat m: dd 1 Zcmp 1 Zcmp 0
14:32 camelia rakudo-moar ed5c86: OUTPUT«(Order::Same,).Seq␤»
14:32 lizmat m: dd 1 Zcmp 2 Zcmp 0
14:32 camelia rakudo-moar ed5c86: OUTPUT«(Order::Less,).Seq␤»
14:32 lizmat m: dd 1 Zcmp 2 Zcmp -1
14:32 camelia rakudo-moar ed5c86: OUTPUT«(Order::Same,).Seq␤»
14:33 lizmat yeah, that feels faulty
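One reading that is consistent with every camelia output above: the chained `Zcmp` is evaluated as a left fold, so the `Order` result of the first comparison (numerically Less = -1, Same = 0, More = 1) is itself compared with the third operand. The following is a minimal Python sketch of that reading, not Rakudo's actual implementation; `cmp` and `chained_cmp` are hypothetical helper names.

```python
from functools import reduce

# Numeric values of Raku's Order enum.
LESS, SAME, MORE = -1, 0, 1
NAMES = {LESS: "Less", SAME: "Same", MORE: "More"}

def cmp(a, b):
    """Three-way comparison returning -1, 0 or 1."""
    return (a > b) - (a < b)

def chained_cmp(*values):
    """Fold left: ((a cmp b) cmp c), feeding each result into the next compare."""
    return reduce(cmp, values)

# The same operand triples that were tried in the channel.
for args in [(1, 2, 3), (1, 2, 1), (1, -1, 1), (1, 1, -1),
             (1, 1, 0), (1, 2, 0), (1, 2, -1)]:
    print(args, NAMES[chained_cmp(*args)])
```

Under this assumption `1 Zcmp 2 Zcmp -1` yields Same because `1 cmp 2` is Less (-1), and -1 compared with -1 is Same, which is why the result looks meaningless for a non-associative operator.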
14:51 brrt left #moarvm
14:51 brrt joined #moarvm
16:02 domidumont joined #moarvm
16:04 zakharyas joined #moarvm
16:58 TimToady yeah, should disallow more than 2 for non-assocs
16:59 zakharyas joined #moarvm
17:25 zakharyas joined #moarvm
17:59 pyrimidine joined #moarvm
18:27 Geth MoarVM: 9475d8db4c | (Jonathan Worthington)++ | src/strings/utf8_c8.c
18:27 Geth MoarVM: Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8
18:27 Geth MoarVM:
18:27 Geth MoarVM: In the last round of tweaks to UTF8-C8, we fixed some sequences that
18:27 Geth MoarVM: would not round-trip properly due to being mis-represented in UTF8.
18:27 Geth MoarVM: The fix dealt with those cases, but was a bit too sweeping. UTF8-C8
18:27 Geth MoarVM: aims to decode everything that's both valid UTF8 and in NFC as the
18:27 Geth MoarVM: UTF8 decoder would, and express everything else as synthetics that
18:27 Geth MoarVM: will ensure round-tripping. This fix deals with the issue raised in
18:27 Geth MoarVM: MoarVM Issue #482, while not regressing any of the UTF8-C8 roundtrip
18:27 Geth MoarVM: tests.
18:27 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/9475d8db4c
18:27 Geth MoarVM: f9e14e9ca8 | (Jonathan Worthington)++ | src/strings/utf8_c8.c
18:27 Geth MoarVM: Merge pull request #494 from MoarVM/utf8-c8-boundary-fix
18:27 Geth MoarVM:
18:27 Geth MoarVM: Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8
18:27 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/f9e14e9ca8
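The round-tripping problem described in the commit message can be illustrated outside MoarVM. The sketch below is an analogy in Python, not the UTF8-C8 implementation: it checks which byte sequences survive a decode, NFC-normalize, encode cycle, and uses Python's `surrogateescape` error handler as a stand-in for the idea of carrying undecodable bytes through a string (MoarVM uses synthetics for this, which is a different mechanism). `roundtrips_via_nfc` is a hypothetical helper name.

```python
import unicodedata

def roundtrips_via_nfc(raw: bytes) -> bool:
    """True if raw is valid UTF-8 and already in NFC, i.e. a normalizing
    decoder could re-encode it to exactly the same bytes."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return unicodedata.normalize("NFC", text).encode("utf-8") == raw

valid_nfc = b"caf\xc3\xa9"   # precomposed U+00E9
valid_nfd = b"cafe\xcc\x81"  # 'e' followed by combining acute U+0301
invalid = b"caf\xe9"         # Latin-1 byte, not valid UTF-8

print(roundtrips_via_nfc(valid_nfc))  # survives unchanged
print(roundtrips_via_nfc(valid_nfd))  # normalization would change the bytes
print(roundtrips_via_nfc(invalid))    # cannot be decoded as UTF-8 at all

# Analogy only: surrogateescape smuggles the bad byte through a str
# so that the original bytes can be reproduced on encode.
text = invalid.decode("utf-8", errors="surrogateescape")
restored = text.encode("utf-8", errors="surrogateescape")
print(restored == invalid)
```

Only the first input is something a normalizing UTF-8 decoder can reproduce byte for byte; the other two are the kind of input that needs a special representation if the original bytes must come back out.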
18:29 pyrimidine joined #moarvm
18:45 camelia joined #moarvm
18:48 camelia joined #moarvm
18:54 pyrimidine joined #moarvm
19:01 Geth joined #moarvm
19:13 brrt joined #moarvm
19:13 brrt ohai #moarvm
19:13 brrt i'm writing a longish blog post on the new register allocator and i've figured out a bug
19:14 brrt it is an extremely annoying bug, which is why i want to tell you about it
19:15 brrt if you read the literature about linear scan, the received wisdom is: expire registers prior to allocating a new one, so that if one of the input registers has its last use in creating this live range, you can reuse its register
19:15 brrt especially for two-operand instruction sets like x86-64, that's great, because that matches well with how the architecture works
19:16 brrt however, to get that effect, you need to arrange registers in a stack, not a register buffer
19:16 brrt s/register buffer/ring buffer/
19:17 brrt so, that's one thing, but things get worse
19:17 brrt suppose you have no registers left and need to spill a value
19:18 pyrimidine joined #moarvm
19:18 brrt suppose you pick to spill a value which is used for the next instruction (i.e. where the new live range starts)
19:19 brrt so then we split the live range into 'atomic' ranges
19:19 brrt since the new 'atomic' range is not in the past, it can't be retired, and must be put on the worklist
19:20 brrt however, once that's done, it can be immediately expired
19:22 brrt so suppose i have *two* such 'atomic' live ranges
19:24 brrt then one is allocated, e.g. to register rcx; before I allocate the second, this is expired, rcx is returned to the stack; the second value is also loaded into rcx, and my program is wrong
19:26 brrt ... and yes, as usual, i know how to fix this
19:26 brrt but i'm *annoyed*
19:27 brrt also because the literature is just wrong about this
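The failure brrt describes can be reproduced with a toy allocator. This is a minimal Python sketch under stated assumptions, not the MoarVM JIT code: `LiveRange` and `linear_scan_expire_first` are hypothetical names, free registers are kept on a stack, spilling is not modelled, and an 'atomic' live range is represented as one whose start and end are the same code position.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LiveRange:
    name: str
    start: int                 # first code position where the value is live
    end: int                   # position of the last use
    reg: Optional[str] = None

def linear_scan_expire_first(ranges, registers):
    """Expire before every allocation, treating a range as dead *at* its last use."""
    free = list(reversed(registers))   # stack: registers[0] ends up on top
    active = []
    for lr in sorted(ranges, key=lambda r: r.start):
        remaining = []
        for old in active:
            if old.end <= lr.start:    # last use is at (or before) this position
                free.append(old.reg)   # register goes back on top of the stack
            else:
                remaining.append(old)
        active = remaining
        lr.reg = free.pop()            # no spilling in this sketch
        active.append(lr)
    return {lr.name: lr.reg for lr in ranges}

# Two 'atomic' ranges created by a spill, both used by the instruction at position 5.
atomic = [LiveRange("a", 5, 5), LiveRange("b", 5, 5)]
assignment = linear_scan_expire_first(atomic, ["rcx", "rdx"])
print(assignment)
```

Both values end up in rcx: `a` is allocated rcx, then expired while `b` is being processed because its last use is at the current position, so rcx is back on top of the stack when `b` asks for a register.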
19:28 TimToady .oO("We don't know why your Fortran program crashes, but if you just throw in a few extra 'continue' statements, it should start working again.")
19:29 notviki :o
19:29 notviki This all sounds fancy pants.
19:29 zakharyas joined #moarvm
19:30 notviki brrt: is that stuff hard to learn? :)
19:30 TimToady and yes, I heard that when I was (quite a bit) younger
19:30 brrt JIT compilers have many moving parts. that makes them kind of hard to explain
19:31 brrt each of the individual things is bog-standard. binary heap, disjoint set, linked lists
19:31 brrt TimToady: how.. even
19:32 TimToady probably buffer boundary issues
19:33 brrt how does 'continue' make it work then :-o
19:33 brrt did that actually help
19:34 brrt notviki: i'm kind of hoping my blog has some practical hints on how you can do things. and i try to keep the LoC of the jit low
19:34 brrt that's also because it's just me writing it now
19:35 notviki brrt: do you have a CS degree?
19:39 brrt (the proper fix, by the way, is to do two things; expire values *after* rather than *at* their last use; and add a special 'reuse' mechanism that checks if a register can be reused *at* its last use; alternatively, expire registers only once at a given code point)
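The last alternative, expiring only once per code position, can be sketched as follows. This is a toy Python model under assumptions, not the MoarVM JIT implementation: `LiveRange`, `linear_scan_expire_once` and `last_expired_at` are hypothetical names, free registers live on a stack, and spilling is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LiveRange:
    name: str
    start: int                 # first code position where the value is live
    end: int                   # position of the last use
    reg: Optional[str] = None

def linear_scan_expire_once(ranges, registers):
    """Run the expire step at most once for any given code position."""
    free = list(reversed(registers))   # stack: registers[0] ends up on top
    active = []
    last_expired_at = None
    for lr in sorted(ranges, key=lambda r: r.start):
        if lr.start != last_expired_at:
            remaining = []
            for old in active:
                if old.end <= lr.start:
                    free.append(old.reg)
                else:
                    remaining.append(old)
            active = remaining
            last_expired_at = lr.start
        if not free:
            raise RuntimeError("would need to spill; not modelled in this sketch")
        lr.reg = free.pop()
        active.append(lr)
    return {lr.name: lr.reg for lr in ranges}

# Two 'atomic' ranges at the same position now get distinct registers.
atomic = linear_scan_expire_once(
    [LiveRange("a", 5, 5), LiveRange("b", 5, 5)], ["rcx", "rdx"])
print(atomic)

# Reuse at the last use still works: y starts where x is last used.
reuse = linear_scan_expire_once(
    [LiveRange("x", 1, 5), LiveRange("y", 5, 9)], ["rcx"])
print(reuse)
```

The first case no longer hands rcx out twice, because `a` cannot be expired while `b` is allocated at the same position. The second case shows that the two-operand-friendly reuse is kept. The sketch relies on ranges being processed in position order, which is one way in which this approach is brittle.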
19:41 brrt notviki: not really, i've a degree in environmental science :-)
19:42 brrt i've kind of learned by brute force. perhaps not the most efficient way of doing it
19:43 TimToady brrt: it doesn't have to be continue, it can be anything that shifts the positions of the rest of the program, but continue is a no-op
19:44 TimToady but it's also possible they used it to mark basic block boundaries, or some such
19:45 brrt hmm, that makes some sense
19:45 TimToady ancient Fortran optimizers were scary good, except when they weren't
19:46 brrt :-)
19:46 brrt these days i think we have a bit more theory behind it
19:46 brrt i kind of like the 'expire once per codepoint' solution best
19:46 brrt simplest to implement
19:46 notviki :)
19:47 brrt and sufficient. but brittle
19:47 mtj_ joined #moarvm
19:47 brrt otoh, everything about compilers is brittle
19:49 TimToady .oO(that's why compiler writers make peanuts...)
19:51 brrt i like peanut butter, so that's something
19:51 brrt also, i think that actual compiler writers (that know what they are doing) actually have reasonable salaries
20:17 samcv ok i'm back
20:18 samcv well uhm. i mean. good morning
20:18 samcv o/
20:23 pyrimidine joined #moarvm
20:39 brrt good * samcv
20:39 brrt evening for me
20:43 samcv i'll brb in an hour or so
20:44 * brrt in an hour or 10 or so :-)
20:44 brrt sleep &
20:50 pyrimidine joined #moarvm
20:50 jnthn o/ samcv
20:57 pyrimidine joined #moarvm
21:28 pyrimidine joined #moarvm
21:33 pyrimidine joined #moarvm
22:02 pyrimidine joined #moarvm
22:47 pyrimidine joined #moarvm
23:26 tbrowder joined #moarvm
23:30 samcv jnthn, i have rewritten it to use the new reworked string creation function. waiting for travis builds to complete now
23:30 samcv renamed it getstrfromname instead of getstrbyname, because of the existing getcpfromname
23:36 jnthn samcv++
23:36 jnthn Will look in the morning :)
23:36 samcv aww ok
23:36 jnthn Super sleepy :)
23:36 samcv k
23:40 jnthn Found a moment to look anyway
23:40 jnthn Spotted one more thing
23:40 jnthn But overall looks close
23:40 jnthn Anything else before I attempt sleep? )
23:42 samcv uhm i think that's it
23:42 jnthn OK
23:43 jnthn Be back in the morning then :)
23:43 samcv sleep well :)
23:44 jnthn Thanks...here's hoping :)
23:44 jnthn o/
23:45 samcv o/
23:53 timotimo sleep wellthn
