Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2016-12-27

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
01:40 pyrimidi_ joined #moarvm
02:48 ilbot3 joined #moarvm
02:48 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
02:52 FROGGS_ joined #moarvm
03:23 TimToady joined #moarvm
04:31 geekosaur joined #moarvm
05:10 pyrimidine joined #moarvm
08:00 domidumont joined #moarvm
08:05 domidumont joined #moarvm
11:47 dalek MoarVM: 8b31d97 | samcv++ | / (3 files):
11:47 dalek MoarVM: Fix RT #122471 and #122470 return <control-0000> for \0 and other controls
11:47 dalek MoarVM:
11:47 dalek MoarVM: RT: https://rt.perl.org/Ticket/Display.html?id=122471
11:47 dalek MoarVM: https://rt.perl.org/Ticket/Display.html?id=122470
11:47 synopsebot6 Link:  https://rt.perl.org/rt3//Public/Bug/Display.html?id=122471
11:47 synopsebot6 Link:  https://rt.perl.org/rt3//Public/Bug/Display.html?id=122470
11:47 dalek MoarVM: We now pass several tests we were not passing before in uniname.t
11:47 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/8b31d9752c
11:47 dalek MoarVM: 3dc5647 | jnthn++ | / (3 files):
11:47 dalek MoarVM: Merge pull request #469 from samcv/uniname_no1
11:47 dalek MoarVM:
11:47 dalek MoarVM: Fix RT #122471 and #122470 return <control-0000> for \0 and other controls
11:47 synopsebot6 Link:  https://rt.perl.org/rt3//Public/Bug/Display.html?id=122471
11:47 synopsebot6 Link:  https://rt.perl.org/rt3//Public/Bug/Display.html?id=122470
11:47 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/3dc56470bd
11:48 samcv thanks jnthn
11:49 jnthn Thank you! :-) I'll try and find a moment to look at #468 soon also.
11:49 samcv oh one thing though
11:49 samcv so there is this bug that has been in it for like
11:49 samcv at least 1 year. where space is not space
11:49 samcv m: say ' ' ~~ /<:space>/
11:49 camelia rakudo-moar 340bc9: OUTPUT«「 」␤»
11:50 samcv this ONLY works because it matches the space property of Line_Break
11:50 samcv and it aliases SP to space and Space
11:50 samcv if i change the alias in the unicode property _VALUE_ file, it totally breaks that
11:50 samcv so i've gotten it fixed so White_Space == space. but
11:51 samcv then it still is broken doing ' ' ~~ /<:space>/
11:51 samcv and i cannot figure out why :(
11:51 samcv but i believe thet other PR passes all spectests, and still has that bug.
11:51 samcv which also means the spectests pass :P
11:52 samcv i also encountered the opposite case, where ' ' ~~ /<:White_Space>/ would be 'wrong' (aka it would seem to work for 0x20) but then ' ' ~~ /<:space>/ would be broken. though uniprop-bool would return the correct value for both
11:53 jnthn I think I uncovered something along these lines a while back
11:53 samcv ALSO if I hack the script to change the White_Space property to the 'space' property (so that's its primary name), then change MVM_CC_WHITE_SPACE(something like that) to MVM_CC_SPACE and recompile
11:53 samcv it all works flawlessly
11:53 samcv and all tests pass
11:53 jnthn Lemme try and find it...
11:53 samcv though it is a workaround
11:54 jnthn Yeah, iirc we're being a bit too lenient on what properties we accept without a name qualifier
11:54 samcv that's the only _consistant_ way i was able to fix the bug.
11:54 samcv but it is quite bad that we think 'space' is somehow a property VALUE alias to SP
11:55 samcv m: ' '.uniprop('Line_Break').say
11:55 camelia rakudo-moar 340bc9: OUTPUT«SP␤»
11:55 samcv which is the reason <:space> works. 'space' is one of the whitespace canonical names. while the SP property, it is aliased to Space with capital
11:56 jnthn aha, this: https://github.com/perl6/specs/issues/118
11:56 jnthn I think the badness may boil down to the thing <:space> compiles into
11:57 jnthn Which iirc is a lookup asking "what property name has a value of this name" or some such, which is ambiguous.
11:57 samcv though what is weird
11:57 jnthn I ran into this when realized that *some* regenerations of of the Unicode DB failed spectests
11:57 samcv ' '.uniprop-bool('space') works
11:57 samcv but <:space> doesn't work. actually
11:57 samcv <:space> IS OPPOSITE
11:58 jnthn And then the next re-run on the same input data worked
11:58 samcv and will match non whitespace and not match whitespace
11:58 jnthn Becuase hash randomization screwed up the order
11:58 samcv yeah
11:58 jnthn I think I discovered this while trying to get it to be resistant to hash randomization and always write out thigns in the same order
11:58 samcv i notice that too
11:58 jnthn That patch may be in a branch somewhere
11:58 samcv what patch?
11:58 samcv to fix?
11:58 jnthn No
11:58 jnthn To get consistent order
11:59 samcv ah
11:59 jnthn Which...uh...provided consistent breakage.
11:59 samcv just add sort in front of all the 'keys'?
11:59 samcv :)
11:59 samcv yes
11:59 jnthn I think it needed a couple of places
11:59 samcv well i put sort everywhere keys was
11:59 samcv and then it was always broken
11:59 samcv and it was great :)
11:59 jnthn https://github.com/MoarVM/MoarVM/commit/224a261832558be52e7740b17f17f7f17661a58c
12:00 samcv but the space thing was the most pervasive problem…
12:00 samcv but ONLY in regex
12:00 samcv which i think is a regex bug
12:00 samcv related to that thing you linked
12:00 samcv <:Ll> is not a unicode property, it's a value.
12:00 samcv and other things etc
12:00 jnthn *nod*
12:00 samcv general categories are distinct though
12:00 jnthn There's a useful answer on that issue I linked also, from nova patch
12:00 samcv but who wants to match the SP property of Line_Break without specifying it
12:01 jnthn "No regex engine allows for arbitrary property values for all properties without the associated names, due to the obvious conflicts."
12:01 samcv oh Also, supporting Script instead of Script_Extension would be a mistake since the latter is generally what people expect and should be encouraged over Script. I p
12:01 samcv +1 for that
12:01 jnthn This isn't true. Ours apparently does. :P :P
12:01 jnthn But I agree it really shouldn't. :)
12:01 samcv matches extended script?
12:02 samcv do we even parse that unicode file?
12:02 jnthn No, I meant the "allows for arbitrary property values" thing I quoted.
12:02 samcv ah
12:02 samcv so i guess it looks up the SP property first
12:02 samcv SP => Space... though....
12:03 jnthn Yeah, and the reason it gets that first is, iirc, because we get lucky with the ordering. :S
12:03 samcv if i edit the unicode file and change it from SP;Space to SP; fakeeeee
12:03 samcv then it still breaks
12:03 samcv ah
12:03 samcv yes
12:03 samcv so i don't know what it's looking up for that
12:03 samcv also how to fix it so that it uh. isn't broken
12:04 samcv i have been working hard trying to get things to work
12:04 samcv and i sort of figured out where it breaks
12:04 samcv but it's like "everything is ok here" #####MAGIC HERE#####
12:04 samcv then on the other side it just is totally either screwed up totally or like ok-ishhhhhh
12:04 samcv let me find the line
12:05 jnthn I wonder if we can fix it by changing how <:Foo> (that is, just a value) is compiled
12:06 jnthn So that it just considered general category, and script extension, and nothing else.
12:06 jnthn *considers
12:07 samcv uh emit_unicode_property_value_keypairs
12:07 samcv what about boolean properties
12:07 samcv like space?
12:07 jnthn https://github.com/perl6/nqp/blob/master/src/vm/moar/QAST/QASTRegexCompilerMAST.nqp#L1360 # this is where the compilation happens, fwiw
12:08 samcv property NAMES don't interfere with script or uhm
12:08 samcv YES
12:08 samcv i looked at that
12:08 samcv ahh
12:08 samcv yep
12:08 samcv was hard to understand :P
12:09 jnthn op('unipropcode', $pcode, $pname),
12:09 jnthn op('unipvalcode', $pvcode, $pcode, $pname),
12:09 samcv but i think we should match binary properties
12:09 jnthn That I think is where it's a bit dubious
12:09 samcv script names
12:09 samcv uh
12:09 samcv and general category
12:09 samcv and probably script extensions too
12:09 jnthn Are binary properties reliably unambiguous with general category and script extenion?
12:10 jnthn *extension
12:10 samcv yes
12:10 samcv unles you count Sc and sc
12:10 samcv that is the only exception
12:10 samcv but you shouldn't changecase for names that are only 2 letters anyway
12:10 * jnthn wonders if we'll end up wiht more exceptions in the future :)
12:11 samcv uh
12:11 samcv that's not really an exception
12:11 samcv you're only supposed to allow lowercase and no underscore for names that have an underscore
12:11 samcv prettty sure
12:11 samcv will have to see which TR said that
12:12 samcv so like WSpace, space, White_Space are official names. so you can do also
12:12 samcv whitespace, white_space
12:12 samcv or WhiteSpace
12:13 jnthn Ah, I see.
12:13 samcv actually i think the better rule is the stuff in the 2nd column of the unicode property aliases file
12:14 samcv anything in the 2nd column, the long name, you can do that with
12:14 samcv for sure
12:14 samcv but the 1st column you can't
12:14 samcv let me try and find it
12:15 pyrimidine joined #moarvm
12:15 samcv Loose matching should be applied to all property names and property values, with
12:15 samcv # the exception of String Property values.
12:15 samcv With loose matching of property names and
12:15 samcv # values, the case distinctions, whitespace, and '_' are ignored. For Numeric Property
12:15 samcv # values, numeric equivalencies are applied: thus "01.00" is equivalent to "1"
12:16 jnthn Last I looked at the script, I think we cheated in case a bit also.
12:17 samcv yeah it lowercases them
12:17 samcv takes out _ etc
12:17 jnthn Yup
12:18 jnthn Just covers the common ways you might write it
12:18 jnthn But not truly case insensitive
12:18 samcv but the property value and property name aliases shouldn't be in the same stnructure maybe? idk
12:18 jnthn That's probably the least of our troubles at the moment, however. :)
12:18 samcv seemed weird
12:18 jnthn Yeah
12:18 samcv yeah :)
12:18 jnthn I guess...
12:19 samcv it creates a hash with the data there right?
12:19 samcv so there would be collisions or?
12:19 jnthn At the point we compile the regex we can actually case-analyze <:Foo> for if it's a general category name, a script extension name, or a boolean property name
12:19 samcv is it just a normal kind of hash, with keys and values?
12:20 jnthn I think it's actually not a hash but instead does some kind of binary search
12:20 jnthn But I may be misremembering
12:21 samcv also i'm not sure how it looks up the property names
12:21 samcv for a given name
12:21 jnthn udc2c.pl and the Unicode database stuff is one of the handful of bits of MoarVM that I didn't either write in the first place or significantly rewrite somewhere along the lines.  :-)
12:21 samcv and what the numbers in unicode_property_value_keypairs mean
12:21 samcv heh
12:21 jnthn And its workings have mystified me a few times too :P
12:22 jnthn Also I didn't touch it for a few months. I think that it boils down to something like: each char in the database has a bitfield, which stores bit-packed representations of property values
12:23 jnthn So looking up a property name I believe resolves to an index into a table that specifies the relevant bits to extract
12:24 samcv dammit rt.perl.org isn't posting my emails
12:24 jnthn And then the integer those bits make up provides a way to do a lookup in a property values table
12:24 samcv yeah that's what i sort of thought
12:24 samcv though. does the order of the pairs in it matter
12:24 samcv is what i want to know
12:24 samcv (in the C file)
12:25 samcv i mean i only got everything working fine when the numbers were the same
12:25 samcv like when i manually changed the in the script (when it saw 'White_Space' it changed it to 'space')
12:26 samcv then all the 'space' and "White_Space" whatever pairs were the same numbers for both the property and the property value datastructures in unicode_db.c
12:26 samcv and i've noticed all the ones that i had to workaround in rakudo didn't match either. so
12:26 samcv but i still don't know if the order matters at all, or maybe it shouldn't matter if all of them don't collide
12:27 samcv because i've seen the same key, have a different value in the same structure. so higher up would be {"space", 21} then lower down {"space", 120}
12:28 jnthn Aha, reading MVM_unicode_get_property_str in unicode_db.c is somewhat informative
12:28 samcv jnthn, https://github.com/MoarVM/MoarVM/blob/master/src/strings/unicode_ops.c#L139
12:28 jnthn (And int)
12:28 samcv yeah it is
12:28 samcv but i want to know how it gets the values aside from that.
12:29 jnthn Oh yes, that loosk familiar
12:29 samcv tell me what it does!
12:29 samcv well
12:29 jnthn heh
12:29 jnthn It makes a hash
12:29 samcv what happens if there are multiple same keys
12:29 jnthn By going through a table
12:29 samcv with different values in the table
12:29 jnthn Latest entry wins
12:29 samcv ok so last one
12:29 jnthn Yup
12:29 jnthn But note it does through the table in reverse order too
12:29 samcv so keep regenerating until all the roast tests passes?
12:29 samcv :P
12:29 jnthn (For no particular reason that I can tell)
12:30 samcv ah k
12:30 jnthn Well yes, that's why regenerating passes things :P
12:30 jnthn But it's still because we're too liberal with regards to processing <:Foo> style things, so far as I understand.
12:30 samcv so yeah. if we fix that. i maybe have fixed the problem
12:30 samcv we can see i guess
12:31 samcv let me recompile that 'fixed' i think version
12:31 jnthn Yeah, my feeling is if we can fix that form to only consider general category values, script values, and boolean property names we're good.
12:31 jnthn Unless spectests rely on the previous more liberal interpretation /o\
12:32 samcv then they should feel bad!
12:32 Ven joined #moarvm
12:32 samcv i don't *think* they do
12:32 samcv but we will only know once we change it I think
12:32 jnthn Indeed.
12:32 samcv oh yeah k
12:32 jnthn Anyway, I'm +1 to changing that. I wonder if it's best to try and do it in the regex compilation
12:32 samcv yeah it is fixed
12:33 samcv even ' ' ~~ /<:space>/ works
12:33 jnthn With what fix? :)
12:33 samcv magic
12:34 samcv let me push it to my fork
12:35 samcv https://github.com/samcv/MoarVM/commit/960eefba863e91993e1bd6833d751a539ffa53a1
12:35 samcv oh wait maybe not that one idk
12:35 samcv there are two commits
12:35 samcv it at least works in the most recent one
12:35 samcv https://github.com/samcv/MoarVM/commits/working
12:36 samcv the one just called 'a' it at least works atm. let me run the two tests which caused problems
12:36 samcv through all the fiddling there were two test files that would stop pasing if things went worng
12:36 samcv i should change that from "terrible workaround" to "amazing workaround because it works"
12:37 samcv though it's not like. super great. but working is working
12:37 jnthn :-)
12:37 jnthn True
12:38 samcv oh yes they pass
12:38 samcv let me try the one before commit called 'a'
12:38 samcv just commited quickly cause it worked… haha
12:39 samcv i *think* the one i said was fully working maybe wasn't working so i made another commit? or both work
12:39 samcv idk
12:39 samcv either both work or the newest one works
12:40 samcv after working on it all day and most of the time the breakage being caused by nothing but seemly chance it gets harder to tell. but i know the change i made with the MVM_UNICODE_PROPERTY_SPACE is the only thing that didn't re-break by running it again
12:40 samcv and changing other things
12:40 samcv which is a good place to start if you can finally get it reproducible
12:41 jnthn Indeed
12:41 samcv ok no it's only the most recent one 'a' that passes
12:42 samcv where it looks like i changed it back…
12:43 samcv well. it works.
12:43 samcv let me see what else i changed in that
12:44 samcv well i ended up with a different number of keypairs
12:44 samcv that is 6 things smaller
12:44 samcv oh here jnthn https://github.com/samcv/MoarVM/commit/2f5bbb4c8430bd25fe30b8b524cf0026fe5eb302#diff-8af9c0a9829a9e5b19955068ff5cae8fL1005
12:44 samcv i remember now
12:45 samcv and then once i did that' some of the things errored so i added https://github.com/samcv/MoarVM/commit/2f5bbb4c8430bd25fe30b8b524cf0026fe5eb302#diff-8af9c0a9829a9e5b19955068ff5cae8fL1138
12:45 samcv changed this die into an if condition
12:51 jnthn o.O
12:51 jnthn Ugh. What a headache. :S
12:51 samcv yes
12:55 jnthn Oh heck, thinking about the <:Foo> code-gen again...
12:55 jnthn op('unipropcode', $pcode, $pname),
12:57 jnthn It seems to be feeding a property value in there
12:57 jnthn And relying on us having polluted the property names table with property values
12:57 jnthn That is, $pname is potentally something like Ll
12:57 samcv yeah
12:58 samcv it is really. really not good
12:58 samcv values, properties? who cares! mash em together
12:58 jnthn Indeed :(
12:58 jnthn I wonder if we *only* rely on it in that one place
12:58 samcv which line
12:58 samcv well.
12:58 samcv yeah which line :D
12:58 jnthn The one in NQP code-gen that I referenced
12:59 * jnthn finds it again :)
12:59 samcv oh the one you linked? in nqp?
12:59 samcv ah
12:59 samcv yeah i remember that one
12:59 samcv i thought you meant in moar
13:00 JimmyZ jnthn: my PR 459 needs your review :)
13:01 jnthn https://github.com/perl6/nqp/blob/master/src/vm/moar/QAST/QASTRegexCompilerMAST.nqp#L1404
13:01 jnthn Note how it uses $pname in both lookups
13:01 samcv also jnthn what does this merge_ins do
13:02 samcv and what does it do with uh. these things in that list
13:02 samcv like i can see what it calls but not really what it does with it
13:02 jnthn op means "emit a MoarVM op"
13:02 samcv oh no
13:02 samcv oh ok
13:02 jnthn 'unipropcode' is the instruction name
13:02 samcv ok that makes more sense now
13:03 jnthn $pcode, $pname and registers
13:03 jnthn merge_ins just means "stick this array of instructions into this other array of instructions"
13:03 jnthn Like "append" in Perl 6
13:04 jnthn Earlier in the method we do things like
13:04 jnthn my $pcode   := $!regalloc.fresh_i();
13:04 jnthn Which allocates a register
13:05 jnthn If you want to see the output, then write a /<:Ll>/ or so, and run nqp --target=mbc --output=x.moarvm
13:05 jnthn And then moar --dump x.moarvm
13:05 jnthn You can do it with perl6 instead of nqp too, but the nqp code has less clutter around the stuff you'll want to see, and we re-use the same code-gen path for both.
13:07 jnthn JimmyZ: Will try and get to that soonish :)
13:07 jnthn Time to make lunch; bbl :)
13:07 samcv ok i just checked it _does_ fail something but it's only failing <:Greek>
13:07 samcv all the other scripts work fine
13:07 samcv idk there's something about space and greek
13:07 samcv well greek is a block AND a uh
13:08 samcv script
13:09 samcv m: my $a = "\c[GREEK LETTER SMALL CAPITAL GAMMA]"; $a.uniprop('Script').say; $a.uniprop('Block').say
13:09 camelia rakudo-moar 4724bd: OUTPUT«Greek␤Phonetic Extensions␤»
13:10 samcv yeah it fails that regex test, but it gets those properties fine with uniprop
13:10 samcv so the problem isn't in nqp
13:12 samcv butttt <:Script<Greek>> works fine
13:12 samcv so i think nqp and moar problem here
13:16 samcv thanks for that time jnthn about using nqp instead of perl6. i was dumping the perl 6 and there were so many things
13:36 samcv i gotta go to bed now. talk to you soon jnthn :)
13:42 nine /win 14
13:52 JimmyZ jnthn: thanks ;)
14:21 pyrimidine joined #moarvm
15:00 brrt joined #moarvm
15:03 stmuk joined #moarvm
15:09 dalek MoarVM/even-moar-jit: a4632df | brrt++ | / (3 files):
15:09 dalek MoarVM/even-moar-jit: Make linear_scan allocator public
15:09 dalek MoarVM/even-moar-jit:
15:09 dalek MoarVM/even-moar-jit: Also fix a number of minor things, and add some support for register
15:09 dalek MoarVM/even-moar-jit: specs. I'm not yet sure how to deal with conflicting register
15:09 dalek MoarVM/even-moar-jit: requirements.
15:09 dalek MoarVM/even-moar-jit: review: https://github.com/MoarVM/MoarVM/commit/a4632df9dd
15:09 dalek MoarVM/even-moar-jit: 842d5a7 | brrt++ | src/jit/ (2 files):
15:09 dalek MoarVM/even-moar-jit: Fix some of the more obvious bugs in linear_scan
15:09 dalek MoarVM/even-moar-jit: review: https://github.com/MoarVM/MoarVM/commit/842d5a7814
15:15 brrt nasty bugses
16:12 japhb_ joined #moarvm
18:14 dogbert17 joined #moarvm
18:35 domidumont joined #moarvm
19:27 FROGGS joined #moarvm
19:55 pyrimidine joined #moarvm
23:02 nebuchadnezzar joined #moarvm
23:55 vendethiel joined #moarvm

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary