Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2016-09-28

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
01:48 ilbot3 joined #moarvm
01:48 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
04:09 nebuchadnezzar joined #moarvm
04:27 zakharyas joined #moarvm
05:49 domidumont joined #moarvm
05:52 domidumont joined #moarvm
07:13 brrt joined #moarvm
08:51 jnthn D'oh. So, I wake up, shower pondering what on earth might be wrong with ucd2c.pl, grab breakfast, come to computer to read the news before the day's work...and discover it's a national holiday here today. :P
08:53 jnthn Anyway, figure I'll work the day and take some other day off next week when family visit. Otherwise I'll just spend the day sitting around wondering why a darn Unicode database import script is busted :P
08:54 nwc10 but given the good news that I belatedly read here http://news.perlfoundation.org/2016/09/jonathan-worthingtons-recent-g.html
08:54 nwc10 wasn't today a non-work-work day anyway?
08:55 nwc10 anyway, your "move the holiday" sounds like an excellent plan
08:55 * jnthn somehow suspects that in terms of technical difficulty, the stuff he does for his TPF grant is harder work than this other work :)
08:55 jnthn More fun too, though :)
09:18 dalek MoarVM/unicode9: ed685ed | jnthn++ | tools/ucd2c.pl:
09:18 dalek MoarVM/unicode9: Tweak ucd2c.pl to produce deterministic output.
09:18 dalek MoarVM/unicode9:
09:18 dalek MoarVM/unicode9: Before, the output produced depended on hash ordering, which meant
09:18 dalek MoarVM/unicode9: two runs on the exact same Unicode database could produce wildly
09:18 dalek MoarVM/unicode9: different output. This doesn't help when trying to debug.
09:18 dalek MoarVM/unicode9: review: https://github.com/MoarVM/MoarVM/commit/ed685ed8d8
09:22 dalek MoarVM/unicode9: a94e443 | jnthn++ | tools/ucd2c.pl:
09:22 dalek MoarVM/unicode9: Tweak ucd2c.pl to produce deterministic output.
09:22 dalek MoarVM/unicode9:
09:22 dalek MoarVM/unicode9: Before, the output produced depended on hash ordering, which meant
09:22 dalek MoarVM/unicode9: two runs on the exact same Unicode database could produce wildly
09:22 dalek MoarVM/unicode9: different output. This doesn't help when trying to debug.
09:22 dalek MoarVM/unicode9: review: https://github.com/MoarVM/MoarVM/commit/a94e44337c
09:22 dalek MoarVM/unicode9: fb29f18 | jnthn++ | src/strings/unicode_ (2 files):
09:22 dalek MoarVM/unicode9: Update to the Unicode 9 character database.
09:22 lizmat jnthn++  but /me wonders why that isn't a perl 6 script  :-)
09:22 dalek MoarVM/unicode9: review: https://github.com/MoarVM/MoarVM/commit/fb29f18aa5
09:22 jnthn lizmat: Because it wasn't feasible to make it one at the time it was written
09:22 nwc10 but 'patches soon welcome' (once the current problem is fixed) ?
09:23 lizmat jnthn: feels like LHF for someone else  :-)
09:23 jnthn heh
09:24 jnthn If 1600+ lines of Unicode database parsing, bit-field computation, generating C code, and so forth is appealing to anyone, then yes :P
09:26 lizmat well, at least now it's deterministic  :-)
09:26 lizmat so debugging it should be a piece of cake  :-)
09:26 jnthn Right, so the port should spit out the same thing :P
09:26 jnthn Which will help, yes.
09:27 stmuk_ that *BSD crash appears to happen on more bleading edge linux distros (tumbleweed & arc according to reports) although I still fail to reproduce on linux :(
09:28 jnthn Oh no, please say this isn't going to be that libc lock ellision hardware bug? :/
09:28 jnthn Well, maybe software bug
09:28 jnthn I forget the details
09:29 stmuk_ use of zsh seems common to reports (although many people use it anyway) and I played around with ulimits and zsh but nothing
09:32 stmuk_ or perhaps fs (ufs/brfs?) although I tried brfs and it seemed to work fine
09:35 jnthn https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=800574 is the issue I was thinking of, fwiw
09:36 stmuk_ eww
09:41 jnthn Yes, if it's that then very :( indeed
09:42 jnthn But I've no idea if this is the libc used on BSDs?
09:44 stmuk_ the BSDs use their own libc not glibc (its also related to the android one)
09:56 jnthn Hmm
09:56 jnthn So, emit_unicode_property_keypairs loads both property and property value aliases
09:57 jnthn The problem we have is that some property aliases and property value aliases overlap
09:59 lizmat so, is that a unicode problem?  or a perl scripting issue ?
10:00 jnthn Well, we seem to emit two tables
10:00 jnthn emit_unicode_property_keypairs does the one in question, and emit_unicode_property_value_keypairs does another
10:00 zakharyas joined #moarvm
10:00 jnthn It's unclear why we're paying attention to property *value* aliases in the former
10:03 jnthn Though not doing so fails us an incredible number of tests
10:26 jnthn m: say ' ' ~~ /<:space>/
10:26 camelia rakudo-moar c01fc3: OUTPUT«「 」␤»
10:26 jnthn m: say uniprop(' ', 'Space')
10:26 camelia rakudo-moar c01fc3: OUTPUT«SP␤»
10:26 jnthn m: say uniprop(' ', 'LineBreak')
10:26 camelia rakudo-moar c01fc3: OUTPUT«SP␤»
10:28 jnthn m: say uniprop(' ', 'space')
10:28 camelia rakudo-moar c01fc3: OUTPUT«SP␤»
10:29 jnthn m: say uniprop(' ', 'WSpace')
10:29 camelia rakudo-moar c01fc3: OUTPUT«1␤»
10:40 jnthn Gah, it gets worse
10:40 jnthn There are other collisions too
10:43 jnthn Heh, and at the top of PropertyValueAliases.txt it explains all of this
10:48 jnthn m: say uniprop("x", "AL")
10:48 camelia rakudo-moar c01fc3: OUTPUT«L␤»
10:49 jnthn m: say uniprop("x", "Arabic_Letter")
10:49 camelia rakudo-moar c01fc3: OUTPUT«L␤»
10:49 jnthn m: say uniprop("x", "Line_Break")
10:49 camelia rakudo-moar c01fc3: OUTPUT«BK␤»
10:50 jnthn m: say uniprop("x", "Bidi_Class")
10:50 camelia rakudo-moar c01fc3: OUTPUT«L␤»
10:51 jnthn m: say uniprop("&", "Bidi_Class")
10:51 camelia rakudo-moar c01fc3: OUTPUT«ON␤»
10:51 jnthn m: say uniprop("&", "Line_Break")
10:51 camelia rakudo-moar c01fc3: OUTPUT«BK␤»
10:53 jnthn m: say uniprop($_, "Line_Break") for ^64
10:53 camelia rakudo-moar c01fc3: OUTPUT«CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤BK␤LF␤BK␤BK␤CR␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤SP␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤B…»
10:57 timotimo bok bok bok bok bok goes the hen
10:57 timotimo QU
10:57 timotimo Ambiguous Quotation
10:58 timotimo
10:58 timotimo Quotation marks
10:58 timotimo
10:58 timotimo act like they are both opening and closing
10:58 timotimo mhhh, both opening and closing
10:58 timotimo that's fun :)
10:59 jnthn m: for ^0xFFFF { if uniprop($_, 'Gc') eq 'Pe' { .say; last } }
10:59 camelia rakudo-moar c01fc3: ( no output )
10:59 jnthn m: for ^0xFFFF { if uniprop($_) eq 'Pe' { .say; last } }
10:59 camelia rakudo-moar c01fc3: OUTPUT«41␤»
10:59 jnthn m: for ^0xFFFF { if uniprop($_, 'gc') eq 'Pe' { .say; last } }
10:59 camelia rakudo-moar c01fc3: OUTPUT«41␤»
10:59 jnthn m: for ^0xFFFF { if uniprop($_, 'jg') eq 'Pe' { .say; last } }
10:59 camelia rakudo-moar c01fc3: ( no output )
11:00 jnthn m: for ^0xFFFF { if uniprop($_, 'Joining_Group') eq 'Pe' { .say; last } }
11:00 camelia rakudo-moar c01fc3: ( no output )
11:01 jnthn m: say uniprop('x', 'Joining_Group')
11:01 camelia rakudo-moar c01fc3: OUTPUT«No_Joining_Group␤»
11:01 jnthn m: say ^0xFFFF .map({ uniprop($_, 'Joining_Group') }.uniq
11:01 camelia rakudo-moar c01fc3: OUTPUT«===SORRY!=== Error while compiling <tmp>␤Unable to parse expression in argument list; couldn't find final ')' ␤at <tmp>:1␤------> ap({ uniprop($_, 'Joining_Group') }.uniq⏏<EOL>␤»
11:01 jnthn m: say ^0xFFFF .map({ uniprop($_, 'Joining_Group') }).uniq
11:01 camelia rakudo-moar c01fc3: OUTPUT«No such method 'uniq' for invocant of type 'Seq'␤  in block <unit> at <tmp> line 1␤␤»
11:01 jnthn m: say ^0xFFFF .map({ uniprop($_, 'Joining_Group') }).unique
11:01 camelia rakudo-moar c01fc3: OUTPUT«(No_Joining_Group YEH ALEF WAW BEH TEH MARBUTA HAH DAL REH SEEN SAD TAH AIN GAF FARSI YEH FEH QAF KAF LAM MEEM NOON HEH SWASH KAF NYA KNOTTED HEH HEH GOAL TEH MARBUTA GOAL YEH WITH TAIL YEH BARREE ALAPH BETH GAMAL DALATH RISH HE SYRIAC WAW ZAIN HETH TETH Y…»
11:02 jnthn m: for ^0xFFFF { if uniprop($_, 'Joining_Group') eq 'PE' { .say; last } }
11:02 camelia rakudo-moar c01fc3: OUTPUT«1830␤»
11:02 nwc10 do any of these bugs summon Cthulu by accident?
11:02 jnthn m: say chr(41) ~~ /<:Pe>/
11:02 camelia rakudo-moar c01fc3: OUTPUT«「)」␤»
11:03 jnthn m: say chr(41) ~~ /<:pe>/
11:03 camelia rakudo-moar c01fc3: OUTPUT«「)」␤»
11:03 jnthn m: say chr(41) ~~ /<:PE>/
11:03 camelia rakudo-moar c01fc3: OUTPUT«Nil␤»
11:03 jnthn m: say chr(1830) ~~ /<:PE>/
11:03 camelia rakudo-moar c01fc3: OUTPUT«Nil␤»
11:03 jnthn m: say chr(1830) ~~ /<:Pe>/
11:03 camelia rakudo-moar c01fc3: OUTPUT«Nil␤»
11:06 jnthn The problem seems to boil down to "what should the <:Foo> form mean", because it appears to be ambiguous at the moment
11:07 jnthn At present, we take the property, and map it back to a Unicode property name
11:08 jnthn Ah, here's the bytecode:
11:08 jnthn 00074   const_s      loc_31_str, 'Foo'
11:08 jnthn 00075   unipropcode  loc_30_int, loc_31_str
11:08 jnthn 00076   unipvalcode  loc_29_int, loc_30_int, loc_31_str
11:08 jnthn 00077   hasuniprop   loc_28_int, loc_6_str, loc_7_int, loc_30_int, loc_29_int
11:09 jnthn So, we then use the property we mapped it to, to grab a Unicode property value code
11:10 jnthn We then ask if the string we're matching (6_str) and the given location (7_int) has that property/value
11:13 jnthn In the long form like :PropName<Value> there's no ambiguity
11:13 timotimo did they add more aliases so that we now have ambiguities?
11:13 timotimo and thus our test suite has become bogus?
11:14 jnthn It doesn't seem so
11:15 timotimo strange that it'd explode now, then :\
11:15 timotimo hey, we can constant-fold unipropcode and probably also unipvalcode!
11:15 jnthn Indeed
11:15 jnthn Well, in spesh at least
11:16 timotimo that'll help all those scripts that spend 50% of their cpu time inside unicode_db!
11:16 timotimo yes, otherwise we'd bind precompiled scripts to the unicode version they were compiled with
11:17 jnthn Thing is, if I reverse the order of the loop in generate_property_codes_by_names_aliases then I get more things busted
11:17 jnthn Meaning we were relying on, so far as I can tell, a completely aribitrary mechanism for resolving the conflcits
11:17 jnthn *conflicts
11:18 timotimo ugh, that's a really bad "decision" :)
11:24 jnthn Wow, I just put the Unicode 8 DB back in with my consistent output fix and it also fails that space test
11:25 timotimo o_O
11:25 * timotimo reconsiders bragging about moarvm's unicode support
11:26 jnthn Well, it boils down to a lang design issue too, I think
11:26 jnthn The what...even if I back out my changes to sort things and feed it the Unicode 8 DB it still fails the one test
11:27 timotimo not having the decision codified yet what a uniprop in a regex really means, yes?
11:28 jnthn Something like
11:28 * jnthn tries running ucd2c from master on the Unicode 8 DB
11:29 jnthn huh, that passes
11:34 jnthn Aha, but applying my "produce deterministic output" patch busts it
11:44 jnthn lunch &
11:44 nwc10 jnthn++
12:18 jnthn Back
12:18 jnthn OK, so
12:18 jnthn On master, with the Unicode 8 database, I did 3 runs
12:18 jnthn of ucd2c.pl, make -j install, run the spectest
12:19 jnthn first 2 times pased, third failed
12:19 jnthn So we had the problem all along
12:20 jnthn It's just that hash ordering sometimes hides it
12:20 jnthn For all the cases we have spectests for
12:23 jnthn Do it 3 times with the Unicode 9 DB, and I got pass, fail, pass
12:24 nwc10 but will it be fixed before ilmari returns from lunch? :-)
12:25 ilmari nwc10: I'd have to go for lunch first...
12:25 jnthn :P
12:25 jnthn Now trying this with 24742b1024
12:27 jnthn pass, fail, fail
12:27 jnthn So it can pass with the Unicode 9 DB + NFG tweaks
12:28 nwc10 ilmari: I think that you're giving him an unfair advantage
12:29 jnthn .oO( How many more re-gens until it passes again... :) )
12:29 jnthn bah, I shoulda stashed the passing one :P
12:30 * jnthn would kinda like to get a clarification from TimToady on what we'd like <:Foo> to do
12:30 jnthn S05 doesn't really make things much clearer
12:32 jnthn Hurrah, I have a passing one.
12:32 jnthn So in the meantime, we can have our Unicode 9 DB bump, more by luck than judgement
12:32 jnthn The situation doesn't get any worse
12:33 jnthn But urgh.
12:34 jnthn The other bad news is that NFG will need a more extensive bunch of changes to handle...yes, *emoji*.
12:35 nwc10 does Unicode actually have suitable joiners yet to express jumping the shark?
12:35 jnthn Since the grapheme boundary algorithm seems to have become something that needs more than just looking at 2 chars and being able to decide if there is a grapheme boundary between them
12:36 jnthn Wouldn't surprise me :P
12:36 jnthn m: say "\c[SHARK]"
12:36 camelia rakudo-moar c01fc3: OUTPUT«===SORRY!=== Error while compiling <tmp>␤Unrecognized character name SHARK␤at <tmp>:1␤------> say "\c[SHARK⏏]"␤»
12:36 jnthn jnthn@lviv:~/dev/rakudo$ ./perl6-m -e 'say "\c[SHARK]"'
12:36 jnthn ????
12:36 jnthn Unicode 9 does, however, add a shark :P
12:39 jnthn Seems there's no chars with JUMP in their name
12:41 dalek MoarVM: a19dc90 | jnthn++ | / (2 files):
12:41 dalek MoarVM: Explicitly handle ZWJ in grapheme break.
12:41 dalek MoarVM:
12:41 dalek MoarVM: This is in preparation for updating to the Unicode 9 UCD, which takes
12:41 dalek MoarVM: the Grapheme_Extend property away from Zero Width Joiner.
12:41 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/a19dc90074
12:41 dalek MoarVM: 9b39aa5 | jnthn++ | src/strings/unicode_ (2 files):
12:41 dalek MoarVM: Update to the Unicode 9 database.
12:41 dalek MoarVM:
12:41 dalek MoarVM: Note that while this gives us support for the various new chars in
12:41 dalek MoarVM: Unicode 9, we'll need updates to NFG to fully handle the emoji rules
12:41 dalek MoarVM: in grapheme break handling.
12:58 dalek MoarVM/deterministic-ucd2c: 224a261 | jnthn++ | tools/ucd2c.pl:
12:58 dalek MoarVM/deterministic-ucd2c: Tweak ucd2c.pl to produce deterministic output.
12:58 dalek MoarVM/deterministic-ucd2c:
12:58 dalek MoarVM/deterministic-ucd2c: Before, the output produced depended on hash ordering, which meant
12:58 dalek MoarVM/deterministic-ucd2c: two runs on the exact same Unicode database could produce wildly
12:58 dalek MoarVM/deterministic-ucd2c: different output. This doesn't help when trying to debug.
12:58 dalek MoarVM/deterministic-ucd2c: review: https://github.com/MoarVM/MoarVM/commit/224a261832
12:58 jnthn So, I'll leave that cleanup in a branch for now, until the lang design issue gets a decision/resolution.
12:59 ilmari jnthn: but there doesn't seem to be a water skier emoji
12:59 ilmari the closest is surfer, I guess
13:00 ilmari SPEEDBOAT+ZJW+SURFER+ZJW+SHARK
13:01 jnthn Nice :)
13:03 * ilmari is disappointed there's no WHOLE PIZZA, only SLICE OF PIZZA
13:03 domidumont joined #moarvm
13:03 ilmari and on that note, lunchtime
13:03 nwc10 \o/
13:04 ilmari m: say "\c[SLICE OF PIZZA]" xx 8
13:04 camelia rakudo-moar c01fc3: OUTPUT«(???? ???? ???? ???? ???? ???? ???? ????)␤»
13:04 ilmari although the slices I intend to eat are rectangular
13:05 nwc10 ZWJ is the anti-pizza-wheel?
13:05 ilmari pizza glue
13:07 nwc10 jnthn: `git describe` in MoarVM gives 2016.09-3-g9b39aa5
13:07 nwc10 your NQP bump wants 4
13:08 * nwc10 is surprised that Travis hasn't been on a drive-by yet
13:08 jnthn It did in the other channel
13:17 jnthn (And fixed)
13:17 jnthn I'll leave the NFG changes for another day.
13:19 brrt jnthn++
14:59 jnthn https://rt.perl.org/Ticket/Display.html?id=125978 is "intersting"
14:59 jnthn https://gist.github.com/jnthn/7d2cefda9d88151fb91ba5a62e6c7b4f is from valgrind
15:00 jnthn I'm not quite sure how this state can ever arise
15:34 jnthn Of course, it doesn't seem to show up anything like so easily with a debug build
15:42 jnthn Ah, got a related but different valgrind error out of it now
15:56 jnthn These are sorta suggestive that a thread is running while another is GCing, which would be bizzare.
15:56 jnthn But it's hard to see how:
15:56 jnthn if (ctx->arg_flags) {
15:56 jnthn /* Free the generated flags. */
15:56 jnthn MVM_free(ctx->arg_flags);
15:56 jnthn ctx->arg_flags = NULL;
15:56 jnthn Could lead to a double free otherwise :S
15:57 jnthn The traces clearly say the free in question ran
15:57 jnthn And the NULLing must happen right after it
16:04 Ven_ joined #moarvm
16:10 Ven_ joined #moarvm
16:15 * jnthn suspects he won't fully track this down today...
16:15 jnthn Time for some rest now, anyway.
17:16 FROGGS joined #moarvm
17:34 domidumont joined #moarvm
18:01 Ven_ joined #moarvm
18:21 Ven_ joined #moarvm
18:41 Ven_ joined #moarvm
19:01 Ven_ joined #moarvm
19:08 wrl joined #moarvm
19:09 wrl hey, i'm evaluating languages for embedding inside of a larger C app (ala blender). how's moar's embedding API these days?
19:12 timotimo there isn't a stable api for embedding
19:12 wrl gotcha. is that planned?
19:13 timotimo not sure
19:14 timotimo it's quite possible to just run a moar instance and communicate with the script running inside it via normal IPC things like sockets or pipes
19:14 wrl unfortunately the larger app relies on being able to share wrapped C data structure around
19:15 timotimo oh, i meant run the instance in your own program
19:15 timotimo then you can use NativeCall and friends to get at shared data structures
19:15 wrl ah right i see
19:16 wrl that seems like it could get very obtuse, unfortunately
19:19 timotimo wouldn't know until i tried it, tbh
19:21 Ven_ joined #moarvm
19:32 patrickz joined #moarvm
19:41 Ven_ joined #moarvm
20:00 Ven_ joined #moarvm
20:21 Ven_ joined #moarvm
20:40 Ven_ joined #moarvm
20:56 Ven_ joined #moarvm

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary