Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2015-04-20

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
01:51 colomon joined #moarvm
06:25 brrt joined #moarvm
06:42 brrt \o
06:48 FROGGS joined #moarvm
07:01 zakharyas joined #moarvm
07:01 Ven joined #moarvm
07:06 lizmat joined #moarvm
07:21 lizmat joined #moarvm
07:21 lizmat joined #moarvm
07:44 dalek MoarVM: a4f97a5 | jnthn++ | src/strings/nfg. (2 files):
07:44 dalek MoarVM: Start stashing NFG synthetics into a table.
07:44 dalek MoarVM:
07:44 dalek MoarVM: We don't put them into a trie yet or do lookups there, meaning that we
07:44 dalek MoarVM: currently always create a new synthetic all the time. This will do for
07:44 dalek MoarVM: initial round-trip tests.
07:44 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/a4f97a5368
07:54 dalek MoarVM: 9aaf34a | jnthn++ | src/strings/nfg. (2 files):
07:54 dalek MoarVM: Add function for looking up NFG synthetic info.
07:54 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/9aaf34a21a
07:54 dalek MoarVM: 3123cd5 | jnthn++ | src/ (2 files):
07:54 dalek MoarVM: Make code point iterator iterate synthetics.
07:54 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/3123cd5415
07:55 dalek MoarVM: 78f8c85 | jnthn++ | src/strings/normalize.c:
07:55 dalek MoarVM: Fix bug in grapheme composition algorithm.
07:55 dalek MoarVM:
07:55 dalek MoarVM: Forgot to slide codepoints in the buffer beyond those compressed into
07:55 dalek MoarVM: synthetics backwards.
07:55 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/78f8c851fd
08:19 dalek MoarVM: 6c8bd19 | jnthn++ | src/strings/normalize.c:
08:19 dalek MoarVM: Implement NFG string -> NFD/NFKC/NFKD codes.
08:19 dalek MoarVM:
08:19 dalek MoarVM: Probably not a highly likely path, but thankfully rather easy to do
08:19 dalek MoarVM: with the pieces we already have.
08:20 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/6c8bd19fe5
09:10 nwc10 is there a good way to dump the MoarVM bytecode to text, to be able to diff it?
09:12 FROGGS --dump?
09:14 nwc10 FROGGS: thanks. That looks like a good start.
09:14 FROGGS yes, if some information is missing we should add it
09:14 jnthn It includes most things.
09:15 nwc10 something, somewhere, somehow, seems to be behaving differently if the serialised size of string table offsets changes
09:15 nwc10 *other* than the thingy lazy that is cruel and unforgiving.
09:15 nwc10 and valgrind can't find it.
09:15 nwc10 nor can ASAN
09:16 nwc10 anyway, I'm back to work fun
09:16 nwc10 bad news BeOS users, you'll have to stick to apache 2.2, as 2.4 removed the relevant MPM
09:17 nwc10 OS/2 and Netware still OK: http://httpd.apache.org/docs/current/mpm.html
09:31 nwc10 jnthn: ASAN remains silent on master/master/nom
09:31 jnthn \o/
09:32 jnthn Checking that is the case is one of the reasons I bumped :)
09:53 Ven joined #moarvm
10:00 brrt joined #moarvm
10:07 donaldh joined #moarvm
10:40 nwc10 jnthn: you haven't broken Power64. Keep trying...
10:41 jnthn Working on it :)
10:48 rurban joined #moarvm
11:13 dalek joined #moarvm
11:18 colomon joined #moarvm
11:30 dalek MoarVM: fae0069 | jnthn++ | src/strings/nfg.c:
11:30 dalek MoarVM: Implement re-use of synthetic graphemes.
11:30 dalek MoarVM:
11:30 dalek MoarVM: This is needed to get string equality correct. We use a partially
11:30 dalek MoarVM: lock-free trie to achieve this. Having read the base of the trie,
11:30 dalek MoarVM: any thread is free to traverse it without having to acquire a lock
11:30 dalek MoarVM: (that is, reads are always lock free). Only additions need the lock,
11:30 dalek MoarVM: and it exists to serialize the additions. We always copy nodes that
11:30 dalek MoarVM: are modified, and never modify things in place; we schedule the memory
11:30 dalek MoarVM: to be freed at the next global safe point (at which point we know that
11:30 dalek MoarVM: no thread could possibly be reading any version of the trie).
11:30 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/fae0069445
11:32 jnthn Passes 500 test cases. :)
11:37 jnthn Next up: lunch. Then some of the fancier stuff that needs fixing for NFG to work out.
11:37 jnthn (case change, char classes, etc.)
12:00 brrt joined #moarvm
12:27 * jnthn back
12:29 nwc10 no-one broke anything while you were away.
12:39 brrt \o jnthn
12:42 jnthn o/ brrt
12:46 * brrt is very excited about the possibility of a long stretch of moarvm hacking
12:47 FROGGS I'd be excited if I'd know hot to tackle my (de)serialization bug...
12:48 FROGGS how*
12:48 nwc10 what is *your* deserialization bug?
12:49 FROGGS I changed rakudo to use the nqp::(de)serialize ops instead of using json... https://github.com/rakudo/rakudo/commit/52b1be70f1f52355f35239d94d99457a41421488
12:49 FROGGS I can bootstrap panda just fine, but as soon as I install another module, my serialized blob gets invalid
12:50 nwc10 OK, I have no idea about that sort of thing
12:51 FROGGS nwc10: but but... it is based on C (for some definition of based on)
12:51 FROGGS that is perhaps just one problem: https://gist.github.com/FROGGS/c6d637b32e4665ec3882
12:51 nwc10 you said "gets invalid"
12:51 nwc10 I don't know about that
12:57 rurban joined #moarvm
13:15 dalek MoarVM: 350212e | jnthn++ | src/strings/ (3 files):
13:15 dalek MoarVM: Basic implementation of case change with NFG.
13:15 dalek MoarVM:
13:15 dalek MoarVM: This should work out for most cases - but I fear there's somewhere in
13:15 dalek MoarVM: Unicode where something has a precomposed uppercase but there is no
13:15 dalek MoarVM: such precomposed lowercase, or vice versa. Those cases will wrongly
13:15 dalek MoarVM: produce a synthetic in one direction now.
13:15 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/350212ebe8
13:17 jnthn If anyone knows any such cases off the top of their head, I'd be interested to know 'em.
13:19 jnthn oh...
13:19 jnthn SpecialCasing.txt in the Unicode database has 'em.
13:20 FROGGS m: say "\c[LATIN CAPITAL LETTER SHARP S]"
13:20 camelia rakudo-moar 5cfddf: OUTPUT«ẞ␤»
13:20 FROGGS what does that produce as lowercase?
13:20 FROGGS though, that has nothing todo with composition :/
13:20 jnthn m: say "\c[LATIN CAPITAL LETTER SHARP S]".lc
13:20 camelia rakudo-moar 5cfddf: OUTPUT«ß␤»
13:20 jnthn m: say uniname ord "\c[LATIN CAPITAL LETTER SHARP S]".lc
13:20 camelia rakudo-moar 5cfddf: OUTPUT«LATIN SMALL LETTER SHARP S␤»
13:21 FROGGS m: say "\c[LATIN CAPITAL LETTER SHARP S]".lc.uc
13:21 camelia rakudo-moar 5cfddf: OUTPUT«ß␤»
13:21 jnthn m: say uniname ord "\c[LATIN CAPITAL LETTER SHARP S]".lc.uc
13:21 camelia rakudo-moar 5cfddf: OUTPUT«LATIN SMALL LETTER SHARP S␤»
13:21 jnthn That one should become SS by Unicode spec
13:21 [Coke] jnthn: there is a fudged esset test which hopefully you can now resolve. :)
13:21 jnthn [Coke]: esset?
13:21 FROGGS jnthn: I hope the unicode spec changes before we fix that :o)
13:21 FROGGS eszet
13:22 [Coke] ß
13:22 jnthn [Coke]: Well, I didn't implement the special casing rules yet
13:22 FROGGS m: say "SS" ~~ /:i ß/
13:22 camelia rakudo-moar 5cfddf: OUTPUT«Nil␤»
13:22 [Coke] 121377
13:22 jnthn Just realized that when I do, they interact with NFG.
13:22 FROGGS I hat the Eszet fwiw
13:23 FROGGS so, dont care for that, it is insane anyway
13:23 [Coke] jnthn: right, just letting you know there's unfudgeable stuff when you get there. :)
13:24 jnthn [Coke]: Aye, thanks. There's an RT also iirc :)
13:25 jnthn Here's the entry in SpecialCasing.txt for those curious:
13:25 jnthn # The German es-zed is special--the normal mapping is to SS.
13:25 jnthn # Note: the titlecase should never occur in practice. It is equal to titlecase(uppercase(<es-zed>))
13:25 jnthn 00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
13:26 nwc10 greek final sigma...
13:26 jnthn Joy :)
13:26 jnthn And also in the same file
13:27 nwc10 anyway, took the Unicode consortium a long time to realise that they didn't have an answer for what the capturing groups should contain in (their idea for) "ß" =~ s/(s)(s)/i
13:27 nwc10 so don't assume that everything is even implementable.
13:27 nwc10 they are human.
13:27 jnthn What does Perl 5 do there, ooc?
13:28 nwc10 (and I have some inside knowledge that I'm not quoting anywhere logged about how some of them really are human)
13:28 FROGGS $ perl -E 'use utf8; say "SS" =~ /([ß])/i; say length $1'
13:28 FROGGS SS
13:28 FROGGS 2
13:28 nwc10 um, well, perl 5 always treated //i as just "lowercase"
13:28 nwc10 Unicode had wanted the "case insensitive" matching to be on case folded, not just lowercase
13:28 nwc10 ß case folds to "ss"
13:29 nwc10 (this is all from memory, so might be contradicted by [citation needed])
13:29 nwc10 so Perl 5 wasn't doing what they wanted things to move to (long term)
13:29 nwc10 it just turned out that because no-one anywhere had tried to implement that "long term"
13:29 nwc10 that when you added capturing groups to the mix
13:29 nwc10 (ie not just "does it match, Y/N?"
13:29 nwc10 but also "what bits matched")
13:30 nwc10 that the "bits matched" became a bit tricky to actually answer.
13:30 nwc10 at all.
13:32 * jnthn realizes we have string bitwise opertaions that will need careful handling with synthetics too :)
13:33 nwc10 um, do we have a definition for what the "stringwise not" of a code point is?
13:35 nwc10 eg, how many bits long is the logical not of "A"
13:36 FROGGS m: say ~^'A'
13:36 camelia rakudo-moar 5cfddf: OUTPUT«prefix:<~^> NYI␤  in block <unit> at /tmp/svYs6QVxTG:1␤␤»
13:36 FROGGS I guess we don't have to care about that either at this point... does anybody know what to expect?
13:37 * jnthn has no idea :)
13:37 FROGGS see
13:37 FROGGS so if there are no expectations and probably no use case...
14:09 dalek MoarVM: 633a11c | jnthn++ | src/strings/ops.c:
14:09 dalek MoarVM: Add a missing MVMROOT.
14:09 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/633a11ce8a
14:09 dalek MoarVM: e42476f | jnthn++ | src/strings/ops.c:
14:09 dalek MoarVM: Fix confusing whitespace.
14:09 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/e42476fce6
14:11 dalek MoarVM: b95028f | jnthn++ | src/strings/ops.c:
14:11 dalek MoarVM: cclass check on synthetic uses base codepoint
14:11 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/b95028f75d
14:11 dalek MoarVM: c8aaed9 | jnthn++ | src/strings/ops.c:
14:11 dalek MoarVM: Don't use cp as a variable name for a MVMGrapheme.
14:11 dalek MoarVM:
14:11 dalek MoarVM: Yes, this sin is replicated in various bits of the codebase and wants
14:11 dalek MoarVM: a clear up.
14:11 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/c8aaed9488
14:13 * jnthn typos MVMCodepoint as MVMCodepint, and wonders if it's nearly pub time yet. :)
14:16 dalek MoarVM: c2b818d | jnthn++ | src/strings/ops.c:
14:16 dalek MoarVM: Unicode property checks on synthetics use base.
14:16 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/c2b818da2e
14:20 nwc10 use more 'beer fridge'?
14:21 dalek MoarVM: 8bb5da8 | jnthn++ | src/strings/ops.c:
14:21 dalek MoarVM: Fix typos.
14:21 dalek MoarVM: review: https://github.com/MoarVM/MoarVM/commit/8bb5da80f2
14:33 nwc10 is pub time a reward, or a "step away from the keyboard"? :-)
14:34 jnthn Also a "have dinner" :)
14:34 jnthn It's not really time yet though.
14:35 FROGGS it is always a good time to eat something
14:35 jnthn Though that's probably enough NFG for today. Will take a keyboard break for a bit, and then look at some RTs. :)
14:37 jnthn I certainly feel like the core of NFG is there now; the rest is making places that aren't aware of it be so, and also asking TimToady hard questions about what we want in various cases :)
14:37 jnthn Oh, and plumbing it into I/O. :)
14:38 jnthn Well, I. It's already in O. :)
14:38 nwc10 everything, in O(1) time and space? :-)
14:38 jnthn Probably :P
14:39 jnthn anyway, bbi10
14:59 jnthn Darn, back up to 675 RTs...
15:00 * [Coke] is reminded to start converting misc todo's into RTs again.
15:03 nwc10 IIRC Perl 5 is steady state around 1300 to 1400, so Perl 6 is halfway respectable these days :-)
15:05 [Coke] m: "\c[LATIN CAPITAL LETTER A WITH DOT ABOVE, COMBINING DOT BELOW]".codes.say
15:05 camelia rakudo-moar 958ffb: OUTPUT«2␤»
15:06 jnthn [Coke]: We only NFG things explicilty constructed from a Uni so far.
15:07 jnthn Well, NF-whatever really
15:07 [Coke] jnthn: just autounfudging. that test is currently skipped.
15:07 jnthn Ah :)
15:07 jnthn Well, it's the right answer for .codes at least :)
15:07 jnthn Uh, I think so, anyway :)
15:17 btyler joined #moarvm
15:49 TimToady obviously they should've just added an 'ss' ligature instead of overloading ASCII
15:51 colomon joined #moarvm
17:05 nwc10 TimToady: I'm not convinced about that, but offhand I can't remember what the rules are for matching "ff" and similar ligatures
17:05 nwc10 in the "phone book" ordering, "ß" sorts as "ss"
17:05 nwc10 "Ä" as "AE" (etc)
17:07 vendethiel joined #moarvm
17:08 TimToady that's not matching, that's collation, which is also known to be insane :)
17:08 nwc10 yes, mmm "Ä" isn't expected to match "AE"
17:09 nwc10 so why is ß special snowflake? :-)
17:12 nwc10 me-- # did not prime the beer fridge
17:46 FROGGS joined #moarvm
18:09 nwc10 jnthn: EPIC non-fail. Try harder.
18:22 * TimToady wonders why the comment strings just repeat the hexcodes rather than showing the actual characters, which would be more interesting and informative, IHHO
18:37 japhb TimToady++ # Remembering to change even acronyms to match third-person emoting
18:57 brrt joined #moarvm
18:57 brrt \o
18:57 brrt i have a cunning plan
18:57 brrt to implement spesh-level tracing
18:59 brrt key ingredients - we don't need to trace all instructions ever, we can just trace entry of basic blocks
19:00 brrt we can do this by inserting trace logging statements at bb entry
19:02 brrt then we need to make sure that all callee's also have trace logging inserted
19:02 brrt preferably even when they've been speshed before
19:15 brrt the one tricky bit are invokish ops
19:21 brrt because their invocation is more or less hidden
19:29 AndChat|228864 joined #moarvm
19:32 JimmyZ_ brrt: I think we still need to trace every instruction to trace loops
19:32 brrt you don't
19:32 brrt that's the beauty of it
19:32 brrt because they are basic blocks
19:33 brrt you can't actually leave them :-)
19:33 brrt (within the block)
19:33 JimmyZ_ and consider some loop optimistion
19:33 brrt so if you enter the block, you *will* execute all of them
19:33 JimmyZ_ oh , you meant bb
19:33 brrt aye
19:34 JimmyZ_ not perl6 block...
19:34 brrt no indeed, a perl6 block is quite a larger construt
19:34 brrt construct
19:35 JimmyZ_ so luajit is tracing the loop ?
19:35 JimmyZ_ by some tags?
19:36 brrt i don't know how luajit does it
19:36 JimmyZ_ anyway, sleep time, 03:36 am here.
19:36 brrt sleep well
19:38 JimmyZ_ I was always thinking how luajit is doing, consider it has many advanced optimizitions .
19:38 JimmyZ_ good night.
19:53 lizmat joined #moarvm
19:58 jnthn brrt: Yeah, the block/invokish level tracing is what I'd had in mind.
19:59 brrt much cheaper than adding a check on every opcode i'd think
19:59 jnthn TimToady: (comment strings) 'cus my piece of crap terminal will likely copy-paste them wrong to my editor, slowing me down in debugging stuff, so largely they're optimized for me getting stuff done. :)
19:59 jnthn brrt: Indeed. :)
20:00 brrt tracing is such an awesome optimisation
20:00 brrt it's supercharged inlining
20:00 jnthn Aye, but like all optimizations, it's a trade-off. :)
20:01 brrt right. if your code is not actually tracy it's really costly to deopt all the time
20:02 jnthn nwc10: (try harder) I implemented a partially lock-free trie, that shoulda been advanced enough to screw up, dammit :P
20:15 nwc10 jnthn: would it make sense to generate 2 lines - one with comment strings in hex, and one with the real characters?
20:16 nwc10 I think that having the hex (or U+....) notation around makes a lot of sense, as it's unambiguous and really hard for text editors to screw up
20:16 jnthn nwc10: To me right now? Not really, the tests have served their purpose in getting me to something that seems to behave sanely, so I've not a huge incentive to spend more time on them. ;)
20:17 nwc10 ah. and TimToady has a commit bit? :-)
20:17 TimToady the parts that matter programmatically are already in hex
20:18 TimToady I don't care about those; I'm just curious what the actual character are :)
20:19 jnthn They're all just things with a non-zero Canonical_Combining_Class to me... :)
20:20 nwc10 jnthn: I think I have a livelocked moar again
20:20 nwc10 it's running t/spec/S17-procasync/basic.rakudo.moar
20:20 nwc10 it's here:
20:20 nwc10 #0  AO_load_acquire (addr=0x1000f420378) at 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/gcc/powerpc.h:91
20:20 nwc10 #1  0x00003fffa92f6818 in MVM_gc_enter_from_allocator (tc=0x1000f4206c0) at src/gc/orchestrate.c:378
20:21 nwc10 http://paste.scsys.co.uk/473116
20:22 nwc10 I know nothing about gdb and threads debugging
20:22 nwc10 but TFM suggests that `info threads` is a good start, and we have 6 threads
20:25 nwc10 jnthn: all 6 backtraces http://paste.scsys.co.uk/473122
20:26 jnthn nwc10: ooh, thanks :)
20:27 nwc10 I've ^Z ed the process, so it's not chewing CPU
20:27 nwc10 and awaits further probing
20:27 jnthn nwc10: I guess this is hard to reliably reproduce?
20:28 nwc10 seems to happen at random less than 10% of the time
20:30 [Coke] Wonder if this is related to the flapping S17 tests we see in the dailies.
20:31 jnthn Mebbe, though those don't deadlock
20:31 nwc10 backtrace is for MoarVM at 8bb5da80f2c072be49ffa6f75c2814a6b47dd381
20:32 nwc10 nqp at 53d43e830ecbf34a1dbf9f6b1597f0d57fa540a3
20:32 nwc10 rakudo at b62929991faecf1ae38fc5e0b6d2dd0a675d3187
20:33 nwc10 on gcc110 as described here https://gcc.gnu.org/wiki/CompileFarm
20:33 nwc10 gcc110      2TB    4x16x3.55 GHz IBM POWER7 / 64 GB RAM / IBM Power 730 Express server / Fedora 18 ppc64
20:33 nwc10 gcc112 is more bonkers still
20:34 nwc10 oh gosh, I'm asleep. gcc112 is also ppc64le
20:40 nwc10 oooh, it's funky enough that ccache won't build from source
20:42 nwc10 fedora-- # "perl" - you keep using that package name. I do not think that it means what you think that it means.
21:41 brrt left #moarvm
21:41 brrt joined #moarvm
22:58 vendethiel joined #moarvm
23:22 vendethiel joined #moarvm

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary