Perl 6 - the future is here, just unevenly distributed

IRC log for #parrot, 2014-07-17

Parrot | source cross referenced

| Channels | #parrot index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
01:26 FROGGS_ joined #parrot
05:39 bighugedog joined #parrot
07:04 cooper joined #parrot
08:46 basiliscos joined #parrot
09:45 Timbus joined #parrot
10:04 bighugedog joined #parrot
13:22 rurban1 joined #parrot
13:58 bluescreen joined #parrot
16:48 Chirag joined #parrot
17:29 autark joined #parrot
17:36 rurban1 joined #parrot
17:56 Hunger joined #parrot
20:52 robertle joined #parrot
20:54 robertle hi folks, I am currently trying to learn more about VMs, and have of course looked at parrot as well as lua as examples. I have written a very simple stack-based VM, and would like to try a register based one next as an exercise.
20:54 robertle now there is one thing I don't quite understand, and would be extremely happy if I could some input on
20:56 robertle a) if you e.g. compare lua with parrot, then you will notice that in lua the registers are type-tagged, so you can store any type in any register. in parrot, the basic registers are typed. I can see the case for typed registers: you do not need to go through different code paths depending on the content of a register, you know from the opcode what's in them. so less branching. at the same time, you need more space for teh
20:57 robertle I forgot what b) was about now :)
20:59 robertle ah, obviously I am targeting a dynamically typed language as well, otherwise that would be stupid of course
21:09 rurban imho lua's approach uses less space, so it's much faster in an intel cpu
21:10 rurban I also work on a lua-based vm: potion i.e. a lua with proper oo, traits
21:11 rurban the branching is not so important if the paths are short. the cpu both basic blocks
21:12 rurban the cpu fetches both
21:13 rurban with int it's a simple shift, with strings a mask, with double a full qword, with other objects a mask
21:13 robertle ah! I didn't know they can fetch both and then just skip some if required!
21:14 rurban lua/potion: no type specific ops, most ops are methods calls, which are cached
21:14 robertle I thought essentially every branch is a potential branch miss pipeline stall unless the rpedictor can avoid that
21:14 rurban no double arithmetic ops, all are method calls. arithmetic ops only for int, with a simple shift
21:15 rurban it only stalls if it has to wait. e.g. on fpu or io or far away heap mem.
21:16 robertle so type-tagged registers sound ok then! do you know what the reason for the fiexd-type registers in parrot is?
21:16 rurban the advantages for parrot are CPS: exceptions, concurrency
21:17 robertle hmm, I thought that a branch misprediction would stall it as well. need to read up on that
21:17 robertle how do these special registers help with concurrency or exceptions?
21:17 rurban and fixed type regs are also pretty fast. you just need a bunch of ops. as in java.
21:17 rurban the registers are stored in the continuation object
21:18 rurban no stack switching and stack adjustments necessary
21:18 robertle ok, but why does it make a difference for that whether they are of a fixed type or of a variable type?
21:18 rurban in lua/potion you have copy all stack values over manually with yield
21:18 rurban not really
21:19 rurban I prefer tagged types. but parrot went with fixed types
21:19 rurban no substantial differences
21:20 rurban fixed types allow full word ints, tagged only limited (as with a common lisp)
21:20 rurban so you cannot store pointeres in ints, and perl5 demands that
21:20 robertle because you need space for the tag?
21:20 rurban however, pointers are always aligned
21:20 rurban yes
21:20 rurban so lua uses the last 2 alignment bits for the type
21:21 rurban you can only store proper ptrs then, not pointers to the middle of some buffer/string.
21:21 robertle ok, this is an interesting topic: assuming you want full-length contents and a tagged type, the main problem is that because of alignment etc, your tagged union gets quite big, right?
21:21 rurban you cannot call it tagged then :)
21:21 rurban it's a normal 2 word value struct then
21:22 rurban lua/potion/common lisp is fast because it uses only one word
21:22 robertle right, with the second being a union of possible types
21:22 rurban perl/python/ruby are slow because their values are huge (3-10 words)
21:23 rurban with perl having a particular unfortunate layout
21:23 robertle so one of the things I would like to try is to basically still use a full type for the register, no lower-order bits as tags, but *not* keep them together quite so much.
21:24 rurban not keeping together? sounds bad
21:24 robertle so a 32-wide register file would be 32*word-sized register + 32*nibble to denote type of that word
21:24 robertle or so
21:25 rurban the type word pointing a type vtable? (as in common lisp)
21:25 robertle right, the obvious problem would be cache effects, so you need to keep this small enough that it does not matter
21:25 rurban you still need 1 bit or 2 for the GC
21:25 rurban or a full word for the refcount
21:26 rurban you can use the free bits in the type word for the gc. I guess ruby does it like this now
21:26 robertle but that's part of the value, right? I mean you would only GC complex things like string, which have to be stored somewhere else and the "word" in the register would just point to it
21:27 rurban yes, immediate values vs refs
21:27 robertle yeah, that sounds good. I was thinking a nibble per register for types, but I don't think I'll end up with 16 types...
21:27 rurban 16 is pretty low. I have 25 so far
21:27 rurban nope, 22
21:28 robertle so lets say in a primitve version you would have int, float, bool, nil, string, hash, vector, special
21:28 rurban https://github.com/perl11/p2/blob/p2/core/potion.h#L133
21:28 robertle first 4 being immediates
21:28 robertle I'll check it out!
21:29 rurban yes, 3-4 immediates + plus a couple of basic types, the rest can fit into a generic container with known size and a vtable pointer
21:30 rurban fperrad also has a luajit based vm, with perl6 support in the works.
21:30 rurban https://github.com/fperrad/tvmjit/tree/master/src
21:30 robertle one thing that I need to work out, and that I would be very glad to get ideas on, is how to structure the instruction set around the dynamic types
21:31 rurban the compiler knows about immediates, the rest is done via method calls
21:31 robertle initiall I thought that many of the ops would behave differently based on the types in the reisters, with many of the options raising an exception
21:32 rurban but some arith ops need to check the types at run-time, for undef, bigints, doubles, strings and such
21:32 robertle so e.g. a "multiply" would allow int and float in the input regs and store and int or float in the target, raise an exception on all other types
21:33 robertle right, that's what I mean! so the question is do I *always* do runtime type check/branch, or do I write an instruction set which only works if the correct types are in the registers, and make the compiler guarantee that
21:33 rurban yes. see e.g. https://github.com/perl11/p2/blob/p2/core/vm.c for the simple lua-alike version ()simplier than lua in fact)
21:33 robertle the second sounds like less runtime overhead, but you need more opcodes
21:33 robertle e.g. you need a separate opcode for float mult and int mult
21:34 robertle with e.g. a byteode format like lua's, you don't get that many...
21:34 rurban you can add a type inferencer (hm style) or gradual typing to restrict types to help the compiler
21:34 rurban generally you have slower ops for the generic case
21:35 robertle still: do I look at the type of the input registers when executing an opcode to determine what to do, or just at the opcode itself
21:35 robertle right, so an option would be to generally look at the input registers, which is slow but powerfull, and have a few fast ops that get executed often.
21:35 robertle int++ or so
21:36 rurban with arith binops I look at both types, and if both are immed. ints, do it fast (one cpu op), otherwise delegate to a method call
21:37 rurban https://github.com/perl11/p2/blob/p2/core/vm.c#L355
21:37 rurban with parrot it's similar, we just have seperate ops for int, double and objects
21:37 rurban and the objects (integer, double, big, complex, ...) do their slow logic
21:38 robertle right, when you say "I look at both types", you mean at opcode dispatch time?
21:38 rurban yes. the biggest trick is to be able to call methods fast, not so the ops
21:39 rurban opcode dispatch is either in a byteocde vm (big switch) or if jitted it's already in the instruction cache linearily layed out
21:40 rurban with parrot we look at 3-6 arg types I guess. if it's a multi method call
21:40 rurban but the delegation is very slow then
21:41 rurban it was much faster a few years back though, before multi dispatch support
21:43 robertle what I was trying to do get the next instruction (which looks like lua in structure), mask off the opcode, determine the source registers and copy the relevant nibbles over. so I end up with one word that contains both opcode and source types
21:43 robertle and the do a computed goto with that
21:44 robertle obviously there would be plenty of cases that do the same, whcih is just another goto
21:45 robertle if I see it right this is the "big computed goto" option, the other one to just do ti on the opcode, and then do something similar inside each op handler to switch on the operand types
21:45 robertle not sure what's better, nprobably needs experimentation
21:48 robertle I need to look at your p2 in more detail, but since it seems to execute lua bytecode, what's the core difference
21:48 robertle ?
21:48 robertle I mean, obviously there is going to be lots of other bits like GC and IO
21:48 robertle or concurrency
21:48 robertle but I mean in the actual heart of the VM, the dispatcher?
21:54 robertle anyway, thanks a lot, I'll do some more playing and will be back. off to bed now...
21:54 rurban robertle: yes, that's ertl's idea for fast vm's. ops + args in one word
21:54 rurban lua does it the same way
21:55 rurban p2 is basically an improved lua, much simplier, pre-luajit
21:55 rurban with proper oo system
21:56 rurban http://www.jilp.org/vol5/v5paper12.pdf
21:57 robertle eh, when you say "ops + args in one word", do you mean lua-style bytecode, or the "big computed goto"
21:58 rurban lua-style bytecode
21:58 rurban big computed goto, is the slow unjitted dispatch
21:58 rurban the fast dispatch is the jit
21:59 rurban See http://perl11.org/p2/ for the potion/p2 docs
21:59 robertle right, but JIT aside the question is still whether you combine opcode and arg types into one big jump table, or do a computed goto based on opcode alone, and then something siomilar to sort out different behaviour based on the types
21:59 rurban or better http://perl11.org/p2/html/
22:00 robertle ah, that explains a bit! I was confused looking at your code, expecting a 3-address machine like lua is, simply beacuse the opccodes are so similar. but this is 2-address?
22:01 rurban no, its the same 3 address machine
22:02 robertle dest = op src what
22:02 robertle what is an operand as well?
22:02 robertle i mean, src and what are operands?
22:02 rurban op + 2 operands, a bigger one and a smaller one
22:03 robertle k
22:03 robertle argh, I *reallY* have to go to bed
22:03 robertle thanks!
22:03 rurban bye!
22:04 rurban actually both operands use 12 bits: https://github.com/perl11/potion/blob/master/core%2Fopcodes.h#L20
23:29 dalek joined #parrot

| Channels | #parrot index | Today | | Search | Google Search | Plain-Text | summary

Parrot | source cross referenced