Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2017-01-05

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
00:10 timotimo good luck!
00:10 timotimo i've had a bit too much success with my last attempt at sleep :D
01:21 samcv sorry jnthn and lizmat
03:05 avarab joined #moarvm
03:05 avarab joined #moarvm
05:26 geekosaur joined #moarvm
06:27 domidumont joined #moarvm
06:33 domidumont joined #moarvm
06:54 domidumont joined #moarvm
07:12 domidumont joined #moarvm
08:07 brrt joined #moarvm
08:07 brrt good hi, #moarvm
08:07 samcv good hi!
08:08 nwc10 good *, #moarvm
08:08 brrt no worries, comments happen and stuff
08:09 brrt even jit compilers ultimately happen if you wait long enough
08:12 nwc10 but in that case are they Just In Time? Or rather late? :-)
08:12 nwc10 don't worry. I'm here all week, but tomorrow is a public holiday (at least Between Keyboard and Chair)
08:12 samcv it's always on time, because it's right in the name
08:12 samcv just in time
08:12 samcv you cannot dispute this
08:12 nwc10 aha
08:13 nwc10 therefore clearly good UGT everyone
08:13 samcv no it's good 'hi'
08:13 nwc10 er, it's also "good morning"
08:13 nwc10 I thought
08:13 nwc10 I'm confused
08:13 samcv no that's what brrt said
08:13 samcv <brrt> good hi, #moarvm
08:13 samcv <samcv> good hi!
08:14 nwc10 yes, I realised that much
08:14 nwc10 but clearly I failed to read it the first time
08:14 nwc10 is it bedtime yet?
08:14 nine .tell jnthn Mandatory reviews are not only good for QA but also help distributing knowledge around the team and grooming new developers. They might help MoarVM becoming less dependent on you.
08:14 yoleaux2 nine: I'll pass your message to jnthn.
08:14 samcv it's a little after midnight here
08:15 * brrt guesess samcv's timezone to be asiatic
08:15 brrt not australian, that'd be even later
08:16 samcv no i live in california
08:16 brrt then it … ought to be considerable past midnight?
08:16 brrt oh
08:16 brrt no, i'm wrong
08:16 samcv but new Zealand is 4  hours off from me, but on a different day
08:16 brrt my timezone mental calculations are off
08:17 arnsholt US eastern seaboard is CET-6, western is CET-9 =)
08:18 samcv -9... what
08:18 samcv it's -8
08:18 arnsholt CET, not GMT/UTC
08:18 brrt also known as 'the one true timezone'
08:18 brrt :-P
08:18 arnsholt Obviously =)
08:19 arnsholt (I *think* that's brrt's timezone as well, which is why I phrased it like that =)
08:19 nwc10 UGT±*
08:19 * brrt suggests introducing stardates
08:19 brrt by the way.
08:19 samcv how are stardates measured
08:19 brrt spilling is finicky
08:20 brrt with clocks
08:20 brrt :-P
08:20 samcv bad definition of time
08:21 zakharyas joined #moarvm
08:23 * brrt wonders if it's worthwhile to create a cons for a spill consisting of (live-range, spill-number)
08:27 brrt nm, i can just stash that in the live range object, and stash the spilled live ranges in an array
08:27 brrt not necessary to do that, but good for debugging
08:32 brrt ohai renamed masak
08:32 brrt with regards to macro's, i have written some truly fearful stuff now
08:40 arnsholt C preprocessor macros, or less insane macros? =)
08:49 brrt C preprocessor macro's of course
08:49 brrt my greatest source of pride is the __COMMA__ macro
09:08 arnsholt I assume it expands to something other than a comma? =)
09:09 * nwc10 hasn't even got halfway through https://news.ycombinator.com/item?id=13319904 yet
09:37 brrt it expands to a comma if defined as a comma, can also expand to something else
09:38 * jnthn yawns
09:38 yoleaux2 08:14Z <nine> jnthn: Mandatory reviews are not only good for QA but also help distributing knowledge around the team and grooming new developers. They might help MoarVM becoming less dependent on you.
09:38 brrt i defined it as a
09:38 brrt '|' to create a bitmap from enums at compile time
09:39 brrt nwc10: the other side is that clearly the python community are doing things right, if they can convince so many people to put so much effort into their language; and they're doing something wrong, because none of that ever goes anywhere / becomes mainstream
09:39 brrt how many alternative python implementations have we seen
09:40 nwc10 "viele"
09:40 nwc10 how many have stuck around?
09:40 nwc10 not "lots"
09:40 brrt pypy… barely
09:41 nwc10 I question why "barely" because I thought that it was doing OK, but I am an outsider looking in
09:41 brrt the project is doing fine, but uptake is still very low
09:41 nwc10 I do find grumpy interesting, because a comment of someone (I think I know who, but I don't want to attribute wrongly or leakily) was that Google stopped Unladen Swallow partly because their (other) investment in PyPy had delivered a lot of speed
09:42 nwc10 (The other reason I suspect was because concurrency wasn't going to work out in Unladen Swallow)
09:42 nwc10 anyway, yes, pypy still hanging in there
09:42 nwc10 dropbox not yet using Pyston in production
09:42 nwc10 (Clearly their goal is C API compatibility)
09:42 arnsholt nwc10: There's a link in that HN discussion to a YouTube video with Armin Ronacher talking about how CPython internals leak into Python user-space
09:42 arnsholt Very interesting
09:43 nwc10 whilst IronPython and Jython appear to have stuck around long enough for firms to use them
09:43 nwc10 and now cause problems beacuse they're on historical versions with now escape path
09:43 nwc10 arnsholt: thanks, not spotted that
09:43 nwc10 will find it and watch it
09:43 nwc10 (later, when not at work)
09:43 brrt not spotted either, will watch as well
09:44 * jnthn ran into Jython in the wild at $big-company, for whatever that data point is worth
09:44 nwc10 well, it means I can ask you
09:44 nwc10 1) so they're code is stuck on 2.5 compatibliity?
09:45 nwc10 2) what are they using now/next?
09:45 nwc10 oh, sorry, I forgot. jython 2.7.0 did ship. About 18 months ago
09:47 brrt the other, other hand
09:48 nwc10 a.k.a "The Gripping Hand" - context https://en.wikipedia.org/wiki/The_Mote_in_God's_Eye
09:50 jnthn nwc10: This was a couple of years ago, so I don't know what happened now/next.
09:50 jnthn Probably "nothing" though.
09:52 jnthn iirc they were using jython to get at Java libs
09:52 jnthn So no particularly easy escape
09:53 jnthn *sigh* So I got another night of LTA sleep.
09:54 * jnthn will attempt not to be too grumpy today...
09:54 nwc10 :-(
09:54 nwc10 I had those the night before last and the night before that, but last night was OK
09:54 nwc10 does this affect your coffee bootstrapping?
09:56 jnthn No, coffee was produced fine today
09:56 jnthn However
09:56 jnthn I normally drink a 5
09:57 jnthn At Christmas somebody sent me a collection on different coffees. And a few days ago I ran out of the one I normally drink so figured I'd crack on with the collection.
09:57 jnthn I didn't expect a 3 to seem so much weaker. I've no idea what I'll make of the 1s in the pack. :)
09:58 jnthn *a collection of
09:58 * brrt can agree with taht
09:58 brrt anyway: the wider issue is
09:58 nwc10 line them up like shots of spirits and just neck the entire row? :-)
09:58 brrt lots of people have the problem of 'python doesn't perform'
09:59 brrt which has a bunch of aspects; one of which is the global interpreter lock
09:59 brrt another aspect - which i think is even more relevant - is memory usage and startup times
10:00 brrt unfortunately, pypy helps with neither
10:00 brrt the thing is, it generalizes to ruby and perl as well
10:00 brrt lots of memory usage, lots of disk IO on startup, lot's of magic and intermediate layers
10:01 jnthn I expect different people have problems with different ones of those
10:02 nwc10 sorry to seemingly sound a bit glib, but "me too"
10:02 nwc10 and I am arm waving
10:02 nwc10 and I'm probably biased
10:02 brrt in my experience it's really, really common
10:02 brrt (PHP also has this problem, obviously)
10:02 jnthn (So the set of people for who the GIL is the blocker may be a different set from whom startup/memory is the blocker)
10:03 nwc10 but the thing I notice about language syntax or similar encancement requests is "I'd like X" and folks thinking "hence everyone would like X"
10:04 brrt i think both inhibit 'scale' in a way
10:05 brrt the effect of having a GIL is that if you want to do *anything* asynchronously, you need to resort to 'a solution' or 'a framework' or IPC or whatever
10:05 brrt you can't just start a new thread in your uWSGI handler, for instance, and think that things will be fine, because they won't
10:06 brrt the fact that you use so much memory limits the amount of process handlers you can have
10:06 brrt so now, we need autoscaling, and so we need a cloud, and so we need containers, and a distributed container management, and…
10:06 brrt i'm oversimplifying, of course
10:07 arnsholt nwc10: On gem from that talk: "Until pickle dies, we will never have versioned submodules" =)
10:07 arnsholt s/On/One
10:07 nwc10 does he explain why in the talk?
10:07 arnsholt Yeah
10:07 nwc10 good. then I won't ask here :-)
10:08 jnthn brrt: If you're scaling sufficiently you'll hit the need to go multi-node whatever the thing is written in, I suspect.
10:09 brrt hmmm
10:09 brrt i guess my point is
10:10 brrt google's move to go, at their scale, is the wisest thing they could've done, and patching up python to make it do what go can do has proven unreasonable in the past
10:10 nwc10 yes, that's a good summary (and I agree)
10:10 jnthn For sure, if you can run half the number of nodes then you save a lot at that scale.
10:11 nwc10 and also have spotted some comments somewhere in the discussions that I have read, that they don't use the dynamic features (that are expensive to implement) that they aren't putting in
10:21 brrt they do allow for introspection, though
11:02 * nwc10 reaches " Although Grumpy is compiled, it is just as dynamic as Python, in that method dispatch involves dictionary lookups, etc."
11:02 nwc10 which is interesting. It suggests that their use case is going to be "whatever we can do ahead of time"
11:06 samcv also jnthn it looks like all we need from the canonical combining class is to check whether it's 0 or whether one is greater than the other, and also whether they are equal. yey
11:06 samcv can get the int value for the enum since they are all in order always
11:12 jnthn Yeah, the non-zero values are primarily used in canonical ordering
11:12 jnthn Perhaps even exclusively, thinking about it.
11:12 samcv yeah
11:12 jnthn Or at least, if I used them elsewhere I forget
11:12 jnthn A little bit of history by the way
11:12 samcv well. also used in unicode_normalizer_process_codepoint_full
11:12 samcv which can probably be taken out (maybe)
11:13 jnthn But only for non-zeroness?
11:13 samcv hmm?
11:13 jnthn Originally we didn't use TR29's definition of grapheme-break at all
11:13 jnthn And just combined things with a non-zero CCC
11:13 jnthn (In a very early cut of the code)
11:13 samcv i believe the GCB property should be different for ones we care about
11:13 samcv but as i said. may not be true
11:14 samcv but i believe it is
11:14 jnthn So it's very possible that the uses of CCC outside of the canonical ordering can be replaced by soemthing else.
11:14 samcv so hopefully we can check the GCB proprety to decide what to do, and if it's not 0, er or whatever the default value is
11:14 samcv yeah
11:14 jnthn The main thing is that we still need to make sure we are getting to NFC form *and* doing NFG
11:15 samcv yep
11:15 jnthn NFG is kinda an NFC++ in that sense.
11:15 jnthn And NFC is formed from NFD
11:15 jnthn The code at present does a quite literal 3-step process on that.
11:15 samcv m: say 0x304.uniprop('GCB')
11:15 camelia rakudo-moar 8568dd: OUTPUT«Extend␤»
11:15 jnthn (Guarded by the quick check to avoid it in many cases)
11:15 samcv m: say 0x304.uniprop('Canonical_Combining_Class')
11:15 camelia rakudo-moar 8568dd: OUTPUT«230␤»
11:16 jnthn There are things that are extends with a CCC of 0
11:16 samcv do we care about those though?
11:16 jnthn Well
11:16 jnthn We do for forming NFG correctly, yes
11:16 samcv i think we care more about extends in that case
11:16 samcv than ccc being 0
11:17 jnthn That's why the NFG algo switched away from caring about CCC
11:17 samcv ah
11:17 samcv yeah. exactly my point
11:17 jnthn All I'm saying really is: be careful that in simplifying it, we don't somehow break forming NFC along the way.
11:17 samcv yeah. exactly
11:23 samcv we may even be able to remove NFG_QC and only do GCB
11:24 samcv but we will see as i start trying to simplify some things. that is much further down the list
11:24 jnthn That...doesn't sound likely
11:24 samcv since it's not too expensive
11:24 jnthn Singleton decompositions are the first counter-example that come to mind.
11:24 samcv well for some parts
11:25 samcv i mean we still need it. but for breaking up the graphemes i think we don't. for other things yes
11:25 samcv i.e. can call it further down the process codepoint full function
11:25 jnthn Ah, OK.
11:26 jnthn Yes, I don't doubt there's better/faster ways to achieve the smae thing at all ;)
11:26 jnthn *same
11:26 jnthn I'd rather like there to be. :-)
11:27 samcv we use nfg_qc on things that decompose and such right?
11:27 jnthn Yes
11:27 jnthn nfg_qc is nfc_qc
11:27 jnthn And then we zero more things
11:27 jnthn So it fully covers all the things NFC does
11:28 jnthn You added some I missed (or maybe new in Unicode 9) recently to that set of "more things", I think. :)
11:28 samcv yep
11:30 samcv jnthn, does anything that have extend or some weird GCB property have NFG_QC =No?
11:30 samcv i don't think they do
11:30 samcv well can't. it just breaks things in how we do things currently.
11:31 jnthn If they do then it'd be a bug, surely.
11:31 samcv yeah
11:32 samcv so maybe just rely on nfg_qc initially. and if it passes, ok fine, if it doesn't then look up the gcb and check decompostion and stuff
11:33 samcv and we can even look up the ccc just once and pass it to canonical_composition and canonical sort. so will not have to look it up again
11:34 jnthn *nod*
11:34 jnthn Though canonical sort waits until it has the things it needs to sort
11:34 jnthn But we can probably keep a parallel buffer of CCCs
11:34 samcv oh true
11:39 samcv oh actually looks like ccc is _not_ in order right now. but i made it in order. so that is nice
11:39 jnthn :)
11:40 samcv we need to borrow a really fast utf-8 decoding function though
11:40 jnthn When I implemented this stuff I was in many places quite literally translating Unicode spec into code that looked quite a lot like the spec. So if you're uncovering easy performance wins I'll hardly be surprised. :-)
11:41 jnthn I tought the one we had got was meant to be decent (in that we did actually borrow it, not write it ourselves :-))
11:41 jnthn *thought
11:41 jnthn But I'm +100 to replacing it with something faster (provided licensing stuff is OK)
11:41 nwc10 One insight from Perl 5 land (that others had and I thought "why didn't I think of it?")
11:41 samcv idk if the qt one could work
11:41 samcv it's simd optimized for a bunch of processors and arm as well
11:41 nwc10 is that you can support all the non-standard stuff
11:41 samcv since qt runs on phones
11:42 samcv and they have fast latin-1 and other format conversion as well
11:42 nwc10 just make sure that the fast one flags if it finds "bad" input
11:42 nwc10 and redo it with slow code that handles the exceptional stuff you want to do
11:42 nwc10 because the common case is (or should be) well formed input
11:43 samcv OK much better! now the nf* tests don't fail anymore now that the ccc is in order :)
11:43 samcv well when replacing with int based function
11:44 jnthn Nice :)
11:44 jnthn samcv++
11:44 jnthn nwc10: Yeah, that's reasonable. One slight challenge is in streaming decoding where you might not have the data to hand any more to look back at
11:46 nwc10 are there any cases that require more than 1 previous character to be held?
11:46 jnthn When decoding UTF-8? All the 2-3 byte sequences :)
11:46 nwc10 specific one I can think of is one valid low surrogate comes in, and it's not followed by a high surrogate
11:46 nwc10 yes. BUT
11:46 samcv define character
11:46 jnthn Oh, you're thinking ofthose
11:46 samcv codepoint?
11:46 nwc10 you can't let the start of the UTF-8 sequence go past
11:46 jnthn That too
11:47 nwc10 because you don't know that it's valid until you find enough continuation characters
11:47 jnthn Oh, I thought you were talking about tracking stuff to report line/column where the error in the input was.
11:47 nwc10 also, surrogates in UTF-8 aren't valid, IIRC, if you're being strict
11:47 nwc10 no, just at the code point level
11:47 jnthn Ah, OK. Then my comment makes less sense :)
12:11 arnsholt samcv: For answering questions like "are there characters with property A and property B at the same time" it might make sense to shove the whole shebang into an sqlite database, or similar?
12:11 arnsholt In fact, looks like someone's already put together tools to do that: https://github.com/Codepoints/unicodeinfo
12:16 samcv why sqlite database... that would be loads slower
12:16 samcv or you mean for my purposes?
12:19 samcv NFG_QC is an internal property not unicode spec tho
12:19 samcv i can just survey all the codepoints using perl6 and check things
12:28 samcv i'm going to bed. night night everybody
12:29 jnthn 'night, samcv++
12:29 samcv arnsholt, there's also this unicode.org/cldr/utility/properties.html
12:29 samcv which is *most* accurate
12:29 samcv mostly
12:29 samcv also haven't figured out how to add and subtract properties yet but i think there is a way
12:30 arnsholt Yeah, I meant for your purposes, not for MoarVM
12:30 arnsholt Because using SQL to get answers to questions is awesome (IMO, anyways =)
15:26 FROGGS joined #moarvm
16:33 brrt joined #moarvm
17:01 ggoebel joined #moarvm
18:09 timotimo i got another day of sleep %)
18:11 dogbert17 o/ timotimo, how are the hiccups today?
18:24 timotimo well, i haven't been up for long, but so far they haven't appeared
18:24 timotimo yesterday they were gone for most of my day-equivalent-timespan, but towards the time i wanted to go to sleep they came back
18:28 dogbert17 so, how much have you slept recently?
18:31 timotimo well, from yesterday to today and from the day before yesterday to yesterday it was ~10 hours each maybe? before that it was scattered attempts of few hours
18:42 dogbert17 10 hours, sounds like a record :-)
18:42 timotimo i'm not saying it's good :P
20:00 ggoebel joined #moarvm
23:45 samcv .ask jnthn so I am trying to get certain emoji working, and so i made this change here https://github.com/samcv/MoarVM/commit/a29f54d973a59668e81da1b05df3420d73f6c58d#diff-60b7c4ce4f16e0e46451e0610dbc21b3R579
23:45 yoleaux2 samcv: I'll pass your message to jnthn.
23:47 samcv .ask jnthn and oh oops i made a mistake. sorta nvm
23:47 yoleaux2 samcv: I'll pass your message to jnthn.
23:48 samcv nice. ok i fixed it 👍

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary