Perl 6 - the future is here, just unevenly distributed

IRC log for #moarvm, 2017-01-17

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
00:09 Ven joined #moarvm
00:49 Ven joined #moarvm
01:09 Ven joined #moarvm
01:12 vendethiel joined #moarvm
01:28 Ven joined #moarvm
01:48 Ven joined #moarvm
02:08 Ven joined #moarvm
02:29 Ven joined #moarvm
02:46 Ven joined #moarvm
02:48 ilbot3 joined #moarvm
02:48 Topic for #moarvm is now https://github.com/moarvm/moarvm | IRC logs at  http://irclog.perlgeek.de/moarvm/today
03:04 Ven joined #moarvm
03:24 Ven joined #moarvm
03:26 lizmat joined #moarvm
03:45 Ven joined #moarvm
04:04 Ven joined #moarvm
04:24 Ven joined #moarvm
04:44 Ven joined #moarvm
05:04 Ven joined #moarvm
05:24 Ven joined #moarvm
05:44 Ven joined #moarvm
06:04 Ven joined #moarvm
06:24 Ven joined #moarvm
06:50 Ven joined #moarvm
07:02 brrt joined #moarvm
07:13 domidumont joined #moarvm
07:19 domidumont joined #moarvm
07:21 Ven joined #moarvm
07:33 geekosaur joined #moarvm
08:11 nwc10 good *, jnthn
08:12 nwc10 jnthn: subtle attempt to get me to create a coffee slick - table football players send a ball ricocheting round the kitchen floor under me, to distract me
08:13 nwc10 and I feel that I'll use that as a feed line (er, drink) to spam the channel with the information that we're trying to recruit coffee drinking table football players: http://unternehmen.geizhals.at/about/de/jobs/
08:13 nwc10 (says so in the job ads)
08:17 brrt good *, nwc10
08:18 arnsholt I'm decent at table football (for a Norwegian, anyways), but I've already got a job (and not that keen on moving to Vienna =)
08:18 arnsholt I'm very good at drinking coffee, though
08:19 brrt i suck at table football and at table tennis and most table sports
08:19 nwc10 I suck too (at all these things)
08:19 brrt caffeine intake tolerance is reasonable
08:19 nwc10 I think that they tolerate me because I drink some coffee and can do Perl OK.
08:21 brrt nwc10: you're situated in vienna, then?
08:21 brrt do they speak german all the time?
08:21 nwc10 no. because I slack at tech stuff in German
08:21 nwc10 and "for some value of" German
08:21 nwc10 (they have different words for some things, which they insist they are correct)
08:22 nwc10 and the best bit - "so that word that I don't know, is it German German, Austrian German, or something anyone outside of Vienna would look at me funny for?"
08:22 nwc10 "don't know"
08:22 nwc10 (sometimes)
08:22 arnsholt =D
08:23 nwc10 Vienna is intersting, particularly coming from London and having lived in Cambridge
08:23 nwc10 it's small enough to feel like Cambridge
08:24 nwc10 but I think I can just see (a corner of) the OPEC HQ (if I look down a particular gap from exactly the right place in the office)
08:24 nwc10 and it's the UN's number 3 city (after NY and Geneva)
08:24 nwc10 so it's not even just "a capital city"
08:24 nwc10 it's got some pretentions above that
08:25 brrt is vienna expensive to live in?
08:25 nwc10 not by London standards :-)
08:26 nwc10 (this is hard to answer)
08:26 brrt i guess, yeah.
08:26 brrt comparing 'costs of living' is really difficult accross countries
08:27 nwc10 I forget where you'd need to go to find the numbers for the "Big Mac" index
08:27 nwc10 but more useful things like the ratio of "median propery cost" to "median salary"
08:27 nwc10 or other stuff that tries to figure out how hard you need to run just to stand still
08:27 nwc10 no idea
08:35 zakharyas joined #moarvm
08:37 Ven joined #moarvm
08:50 arnsholt nwc10: Big Mac index is The Economist, IIRC
08:51 nwc10 I can tell you that McDonalds here sells beer and takes Amex
08:51 nwc10 which partially makes up for the fact that it's McDonalds :-)
08:54 Ven joined #moarvm
09:04 brrt hehehe
09:08 arnsholt Just like Denmark!
09:08 samcv idk if you guys saw what i said in #perl6 but, we can save 1/3 of the size of the unicode name database if we compress as base 40
09:08 arnsholt (Beer at McD's, that is)
09:09 samcv which is pretty nice. saves 250KB out of a total of 787KB
09:09 arnsholt Nice!
09:10 arnsholt Since your fiddling with that kind of idea: Have you tried Huffman coding it too?
09:10 samcv nope
09:10 samcv i mean the strings are short. so. idk
09:11 samcv though i guess the whole thing could be compressed together but, still
09:11 nwc10 runtime readonly access (from memory shared between processes) usually for the (most) win
09:12 samcv also one thing that is annoying is that it takes about as much space to store the point indexes compared to the actual unicode data
09:12 samcv so I need to find a way to somehow compress that
09:13 samcv i mean maybe I could have a bunch of smaller structs? and one for each set of codepoints?
09:13 arnsholt Point indexes?
09:13 samcv so I would have to store much lower numbers and could use narrower int's. actually that's a fantastic idea. have been thinking what to do about that for a while
09:14 samcv that map each point to a column in the unicode data struct
09:14 arnsholt Aha
09:14 samcv well it depends how many we end up having tbh
09:14 Ven joined #moarvm
09:15 samcv it could be a lot and it gets expensive if it goes over the length of a short
09:15 samcv well even short's are too big
09:15 samcv there's so many points
09:16 samcv either way, splitting up points with an offset so i can use the narrowest value will save a lot of space
09:17 nwc10 I'm not familiar enough with the data to know if this is a daft suggestion, but IIRC Unicode does tend to do things in 256 codepoint blocks
09:17 nwc10 so is there a saving if you do some sort of "long" pointer based on the block, and then a shorter offset table for each of the 256 code points in the block, and add them together?
09:18 domidumont joined #moarvm
09:18 samcv that is sort of kinda truish. but the blocks are mostly irrelevant because they basically automatically sort themself. because no row in the bitmap is identical
09:18 samcv so storing indexes to that bitmap is the bigger issue
09:19 samcv since there are 0xE01EF codepoints, well technically there are higher, but that's the highest named codepoint
09:20 samcv so that fits into a 20bit integer, but if we did that for all codepoints :P
09:22 samcv right now we _sort_ of do that
09:22 lizmat but we're basically talking about ascending integer values ?
09:22 samcv we still use up a 16bit integer * 52102
09:22 samcv even though we do apply offsets for ranges of codepoints
09:22 samcv yeah lizmat
09:23 samcv so i'm thinking of just splitting it so i only have to store a short instead
09:24 samcv and one of the reason we ONLY store 52,000 is because CJK ideographs and stuff the name is derived from its properties
09:24 samcv well. that's not exactly why but.
09:25 samcv but atm it's a huge if else tree
09:26 lizmat I was reminded of some search engine internals I was involved with 14+ years ago
09:26 samcv but yeah it does look like it does it by plane I think
09:27 lizmat it was able to encode an offset with about .5 bit in the end
09:27 lizmat (on average)
09:27 samcv nice
09:28 samcv atm i think it's by block or something, it's divided up and i *think* each divided up section is deduplicated, but not the whole thing
09:29 Ven joined #moarvm
09:37 jnthn morning o/
09:37 samcv morning jnthn
09:38 domidumont joined #moarvm
09:44 samcv also i'm guessing let's say I have a char *things[100] = { NULL, NULL.... }; it's going to take up the space of how many pointers? i mean it depends on how it stores where the pointers are to the pointers. becuase it has to store where the pointers are somewhere
09:44 jnthn (nwc10 job add) I can say that Vienna seems really nice; when I was planning a move back to central Europe a couple of years back, it was on my shortlist of options. :)
09:44 jnthn *ad
09:45 samcv if anybody knows. i'm guessing obv you can't depend on anything, but worst case an array of half null pointers, the null pointers could cost the size of 2 poniters for every NULL?
09:47 jnthn samcv: If you just have something like `static Foo *bar = { baz, NULL, wat, NULL };` you mean?
09:47 samcv yep
09:47 jnthn Pretty certain there's no compression of any kind on that
09:47 jnthn NULL will take as much as any other pointer in the array
09:47 samcv well i know AT LEAST it takes up the size of a pointer
09:48 samcv but somewhere it has pointers to point to the NULL pointers
09:48 jnthn Since arrays are accesed by multiplying the element size by the index
09:48 samcv err or any pointer
09:48 samcv ah ok. yeah
09:48 jnthn So the storage of an array is elems * sizeof(elem_type)
09:49 samcv kk, so all the pointers are all contiguous
09:49 jnthn Yeah, a C array will be contiguous in virtual memory :)
09:49 samcv so 1 pointer + the size of an arrays worth of pointers
09:49 jnthn *nod*
09:58 samcv so if i compress the strings, instead of a NULL pointer for something with no name having 8 bytes, it will be stored in 2 bytes instead :)
09:58 samcv in addition to the 1/3 size savings
10:07 dogbert17_ joined #moarvm
10:08 Ven joined #moarvm
10:16 brrt joined #moarvm
10:28 Ven joined #moarvm
10:48 Ven joined #moarvm
11:08 Ven joined #moarvm
11:23 Ven joined #moarvm
11:25 timotimo if you have a whole lot of prefixes like that "LATIN LOWER CASE LETTER", you can have a little table of that
11:25 timotimo however
11:25 timotimo if you can't just pass around a pointer into the big table of strings
11:25 timotimo you have to malloc and free
11:25 timotimo which ... ugh
11:27 timotimo i imagine that problem would also exist if you use base40 for our strings
11:28 timotimo hmm. but most of the time we're already creating a VMString from those things
11:29 timotimo yeah, my worries are entirely unfounded. cool!
11:44 brrt joined #moarvm
11:45 zakharyas joined #moarvm
11:52 Ven joined #moarvm
11:59 samcv hmm for some reason storing the strings in base 40 is not any smaller. at least compiled size. it must have a way of storing a char * [1000] more efficiently than many short arrays
12:00 samcv not sure how to do it without pointers though. and having one array with all the pointers to the short arrays caused the file to be ridiculous.
12:00 samcv :\
12:00 samcv *file size
12:00 jnthn samcv: Which file are you checking the size of?
12:00 samcv unicode names file
12:00 jnthn Yes, I meant compiled output.
12:01 samcv as a char *unicode_names[2000] or whatever, compared to a bunch of unsigned short unicode_name_xx []
12:01 samcv yeah i'm talking about compiled
12:01 jnthn Yes, which compiled file did you look at?
12:01 samcv one I made?. all the file has in it is unicode names
12:01 jnthn Ah, OK
12:01 jnthn I thought maybe you'd built it into moar already
12:01 samcv that is the only thing in it. and I even removed all NULL and empty values
12:01 jnthn the size of moar woudln't change
12:02 samcv heh
12:02 jnthn but libmoar.so would :)
12:02 timotimo how do you store the base 40 thing? C doesn't support 40 bits per array element, so you'll have to do things manually with bit masks and shifts if they are to be stored tightly
12:02 samcv timotimo, well
12:02 samcv this is my script https://github.com/samcv/UCD/blob/master/lib/EncodeBase40.pm6
12:02 samcv you can store 3 characters inside two short's
12:03 samcv err
12:03 samcv 3 characters inside 1 short
12:03 timotimo oh, base 40 is not 40 bits, duh
12:03 samcv yeah
12:03 samcv you can even do different case if you want to get fancy
12:03 timotimo it's a bit late in the day to still be asleep inside your brain
12:04 samcv and use one of the extra characters as a shift
12:04 samcv but c is compiling it to much bigger, but it should really be 1/3 the size in raw data
12:05 timotimo you're actually spelling it "short"?
12:05 timotimo i'm not sure if short is the same length everywhere
12:05 samcv possible
12:05 timotimo i'd go extra-sure and use MVMint8 or whatever
12:06 timotimo do you know of dwarfdump? i've used it in the past a few times to get the actual size of things, but i'm not sure how well it deals with arrays
12:06 samcv i have not used it before
12:07 samcv i mean it must be storing extra pointers or things to the arrays or whatever idk how else it would be the same final size, well actually 10% bigger
12:07 samcv and that's not making an array of pointers to these arrays
12:07 timotimo just dwarfdump path/to/libmoar.so and go through a pager. it's a firehose of info, but searching for identifiers from the code can get you where you need to be
12:07 Ven joined #moarvm
12:07 samcv well i'm not compiling it into moar yet
12:07 timotimo can you show me a diff or something?
12:07 samcv just checking on it by itself
12:08 samcv stand by
12:08 samcv https://gist.github.com/ad1a2161645c2ac3b6282996245d8e7e here is names.str.c
12:09 timotimo ah, yes, it's a big'un
12:09 samcv ye that's the string one
12:09 samcv uploading the base 40 encoded one now
12:09 samcv https://gist.github.com/dd924ae9336bfb160589ec956ded79ea here is that one
12:09 samcv i am storing the number of base 40 numbers as the 1st element
12:10 timotimo ah, indeed
12:10 samcv but that should be smaller than a 64bit pointer, and i'd think that it should be smaller
12:10 samcv since with the char * it has to store the pointers + the strings, but it must be packing the data differently? idk
12:10 samcv i've only checked compiled size
12:10 timotimo it could be trying to ensure aligning
12:10 timotimo you could hexdump the file to see if you can spot lots and lots of null bytes
12:10 samcv yeah
12:11 samcv maybe it's not aligning the strings but is aligning the arrays hmm
12:11 timotimo since we don't care a lot about aligned reads, you could make one big table with all the shorts in it and then assign &bigtable[offset] to all the uniname_* thingies
12:11 samcv yeah
12:11 timotimo give the C compiler less rope to hang us with
12:11 samcv ^
12:12 samcv one big table sounds like it would work
12:12 timotimo you might be able to get the size of the C source to be a bit smaller by using 0x (if the number in decimal is longer than the number in hex)
12:13 samcv though how do i figure out where to go in this table for a specific point. i'm guessing the number of chars of everything is pretty long. wait actually i already computed this h/o
12:13 samcv ok 272267 16 bit unsigned integers
12:13 samcv is all the names
12:13 samcv but then i'd have to store the index inside the big table to access them
12:14 timotimo maybe we only need a linear scan or prefix sum once when we write the code out to the .c file?
12:14 timotimo i.e. not actually store it, just compute it that one time from the lengths of string
12:15 timotimo worst case we can go through the big table and jump from one entry to the next just because at the start it has the length already written in the table
12:15 timotimo so we read 2, so we kip ahead 3, we read 6, we skip ahead 7, we read 5, we skip ahead 6, etc etc
12:15 samcv i am not familiar with linear scan
12:16 samcv hm
12:16 timotimo oh, i mean just go through all elements with a for loop
12:16 samcv yeah i mean when we want a name lookup we load a hash table anyway
12:16 timotimo oh, right, we do
12:16 samcv so could just start from 0, and store number of 16bit ints after, if we see a 0 then the char has no name
12:17 samcv if it's not 0 then load the name
12:18 timotimo ah, we only need the list of starting points for the hash table anyway?
12:19 samcv starting? well i would think we'd start at 0
12:19 samcv and then possibly skip certain ranges that are long enough to matter
12:19 timotimo er, i mean, the location where each name starts
12:19 timotimo like, do we need one array that gives us codepoint to position in table?
12:20 samcv well if we just go through the structure once and load the hash we don't care where it starts
12:20 samcv no
12:20 samcv well. i *think* we can access the name from the cp with the hash, but you would know better than me
12:20 timotimo only if we ever use the codepoint itself as a hash key
12:20 samcv atm i think we lookup the name in the array of char *'s for looking up by cp and when looking up by name use the hash
12:20 samcv hm
12:21 timotimo sounds like we do need that
12:21 samcv we might already i'm not sure
12:21 samcv idk at least we supply it to the macro in two places
12:21 samcv i really don't know though
12:22 timotimo our new array of codepoint-to-name might look like unsigned short *cp_to_name = {&bigtable[0], &bigtable[4], &bigtable[18], &bigtable[42], &bigtable[1337], ...}
12:22 timotimo might want a macro for &bigtable[N] there ...
12:22 samcv so store it in multiple big tables?
12:23 samcv are we going to be able to find the index in the data structure directly or have to scan through it and load it?
12:23 timotimo we definitely can create the indices while generating the .c file
12:23 samcv imo it should be fine if we need to load a hash for it, because using \c[whatever] already has to load it
12:23 samcv but there's so many codepoints
12:23 timotimo and we should. the less stuff we have to initialize at startup time, the better
12:23 samcv ok
12:24 samcv so similar to how i was thinking of splitting up the indexes from cp to the bitfield?
12:24 timotimo i don't remember how you were going to do that, but ... probably?
12:24 samcv so that we don't have to store really big numbers inside it and can use narrower types?
12:24 samcv yeah
12:24 samcv was going to split it up into as many would fit into a 16bit uint
12:25 timotimo you mean so that the index fits into 16bit?
12:25 samcv which would half the number of bytes needed to store the index. because it ends up being much more than the data needed to store the property values
12:25 samcv yeah
12:25 timotimo OK. if the index always fits into a whole number, that's good
12:25 samcv we will have like 5k-10k bitfield rows, and then like
12:25 samcv huge number of codepoints
12:26 timotimo because some codepoints share bitfield rows?
12:26 samcv err wait what am i thinking about. 10k will fit into a 16bit fine, uhm
12:26 samcv most do
12:27 samcv if you dedup it properly
12:27 timotimo content-addressed storage :D
12:29 samcv but yeah the names take up MUCH more space than all the other content
12:29 samcv even without doing all these optimizations like splitting things up
12:30 timotimo oh twitter
12:30 samcv ?
12:30 timotimo someone tweets #perl6: how to use ... and Daily Tech Issues also tweets that exact thing
12:31 timotimo no matter, just some random tangent
12:33 samcv btw here is the C code that will convert from the base 40 numbers https://github.com/samcv/UCD/blob/master/base40decode.c
12:34 samcv and it's nice because we can extend it later and add more letters
12:34 timotimo mhm, that looks simple enough
12:34 samcv that \n' there should prolly be a '-', but. we can remove the \0 ones and have one value be a shift
12:34 samcv and if it sees that character in front of another it will change case or access another character
12:34 samcv whatever we want really
12:34 timotimo right, we basically do utf8 :)
12:35 samcv hmm?
12:35 timotimo not important
12:35 samcv oh lol
12:35 timotimo tbh, it's not like utf8 at all
12:35 timotimo ok, so what's the current state of generating the .c from our list of names?
12:35 samcv in moarvm now?
12:35 samcv or in my repo
12:36 timotimo whatever's newest with our ideas and experiments
12:36 samcv oh. well the base 40 is what we should try to go for, because 1/3 reduction in size
12:36 samcv and figure out some way to get the data into some way that won't waste space
12:37 timotimo right
12:37 timotimo do we have code to stash all our base40 values into one big table yet?
12:37 timotimo and generate a second table that has a pointer into the big table for every codepoint?
12:38 timotimo so we can just get_chars(bigtable[codepoint], buf) or something?
12:38 samcv yeah i do
12:38 timotimo cool. but that still doesn't give us a small .o file?
12:38 samcv nope. it gets to be like 90MB
12:39 samcv if i remove the table of pointers though, it becomes like 5% more than just an array of char *'s
12:40 timotimo oh
12:40 samcv but you can checkout the repo i have. and run UCD-download.p6, then run perl6 ./UCD-gen.p6 --less=1000 or something
12:40 timotimo don't forget if you have uniname_1, uniname_2, ... it'll also generate one entry in the symbol table for each of those
12:41 samcv and it will generate a file in ./build/names.c
12:41 samcv https://gist.github.com/ad811b58480561061a686fc69ded1e73
12:41 samcv this is it with --less=2000
12:41 samcv err i did 100. but yeah
12:43 samcv you can run 'make' to compile both names and bitfield.c
12:43 samcv bitfield.c, if you run the compiled file, bitfield. it will work fine
12:43 samcv print out the property values and chars for at least like non control characters up to 100 or something
12:43 samcv using the grapheme cluster break to figure out whether to print the character verbatim otherwise just show U+
12:44 timotimo did you hexdump the resulting file when you compile names.c?
12:44 timotimo it has a section in it that's just:
12:44 timotimo 00002a10: 756e 696e 616d 655f 3000 756e 696e 616d  uniname_0.uninam
12:44 timotimo 00002a20: 655f 3200 756e 696e 616d 655f 3400 756e  e_2.uniname_4.un
12:44 timotimo 00002a30: 696e 616d 655f 3600 756e 696e 616d 655f  iname_6.uniname_
12:44 timotimo 00002a40: 3430 0075 6e69 6e61 6d65 5f34 3200 756e  40.uniname_42.un
12:44 timotimo 00002a50: 696e 616d 655f 3434 005f 4954 4d5f 6465  iname_44._ITM_de
12:44 timotimo 00002a60: 7265 6769 7374 6572 544d 436c 6f6e 6554  registerTMCloneT
12:44 timotimo 00002a70: 6162 6c65 0075 6e69 6e61 6d65 5f31 3800  able.uniname_18.
12:44 samcv heh
12:44 timotimo 00002a80: 756e 696e 616d 655f 3132 0075 6e69 6e61  uniname_12.unina
12:44 timotimo 00002a90: 6d65 5f31 3400 756e 696e 616d 655f 3539  me_14.uniname_59
12:44 timotimo 00002aa0: 0075 6e69 6e61 6d65 5f35 3700 756e 696e  .uniname_57.unin
12:44 samcv no wonder it uses so much space
12:44 timotimo 00002ab0: 616d 655f 3136 0075 6e69 6e61 6d65 5f35  ame_16.uniname_5
12:44 timotimo 00002ac0: 3100 756e 696e 616d 655f 3130 0075 6e69  1.uniname_10.uni
12:44 timotimo i expect that's where your overhead comes from
12:45 samcv well then proof it must be smaller!
12:45 samcv well the underlying data :P
12:46 jnthn o.O
12:46 jnthn Is that debug data, or a linking table, or?
12:46 timotimo the code was:
12:46 timotimo unsigned short uniname_32[3] = {2,31041,5000};
12:46 timotimo unsigned short uniname_33[7] = {6,8963,19253,2409,24597,20858,17600};
12:46 timotimo unsigned short uniname_34[6] = {5,28055,32060,15014,59721,29240};
12:46 timotimo unsigned short uniname_35[5] = {4,23253,3418,59969,11760};
12:46 timotimo so yeah, linking data
12:46 timotimo it might go away if we put "static" in front?
12:47 ilmari const too?
12:47 timotimo doesn't
12:47 ilmari are you looking at the actual code/text segments, or debug info
12:48 timotimo 1651889 16K -rwxr-xr-x. 1 timo timo 14K Jan 17 13:43 names*
12:48 timotimo 1669101 16K -rwxr-xr-x. 1 timo timo 16K Jan 17 13:47 staticnames*
12:48 timotimo this is without any flags, so shouldn't have -g, right?
12:48 timotimo -O3 doesn't make it better
12:48 timotimo OK, strip makes it go down to 8.3K
12:48 ilmari use size, not ls
12:49 samcv from how big?
12:49 samcv which file are you testing on?
12:49 ilmari timotimo: how about const?
12:49 samcv this is the 100 line file?
12:49 timotimo i added const, it made ti bigger
12:49 samcv err 100 name file
12:49 samcv heh
12:49 timotimo samcv: i took your last gist with names.c in it
12:49 samcv this ? https://gist.github.com/samcv/ad811b58480561061a686fc69ded1e73
12:49 samcv kk
12:50 timotimo precisely
12:51 samcv ok i'm going to generate 2000 names. that may be better for comparison
12:51 timotimo OK
12:51 samcv and closer to real life
12:51 samcv 100 is a little small
12:51 timotimo well, as close as unicode gets to real life :P
12:52 ilmari making it static consts takes it from 8992 to 7648
12:52 samcv i go from 198 to 125K if i strip this
12:52 samcv <samcv> 100 is a little small
12:52 ilmari text   data    bss    dec    hexfilename
12:52 ilmari 2503   1408      8   3919    f4fconstnames
12:52 ilmari 1115   2752     72   3939    f63names
12:52 ilmari 1115   2752     72   3939    f63staticnames
12:52 samcv https://gist.github.com/7c95f29c5f89460f5bfe2ddd72c3e689
12:52 samcv err here it is
12:53 ilmari note how it moves from data to text, so it'll be mapped shared between processes
12:53 timotimo that is desirable
12:54 ilmari text   data    bss    dec    hexfilename
12:54 ilmari 55983  16608      8  72599  11b97constnames
12:54 ilmari 1115  71264    256  72635  11bbbnames
12:54 ilmari 1115  71264    256  72635  11bbbstaticnames
12:54 ilmari the 2k-name one
12:55 samcv all three of those are 2k name ones?
12:55 ilmari yes
12:55 samcv so we want static but not const?
12:56 ilmari we want both, if they're actually constant
12:56 samcv even if the size is bigger?
12:57 ilmari the total size is 200 bytes bigger, but the actual data is moved from data (unshared) to text (shared)
12:57 samcv ah ok
12:57 ilmari so you'll save 55k per process
12:57 samcv 200 bytes is not much
12:58 samcv nice
12:58 nwc10 200 bytes should be enough for ilmari's lunch :-)
12:59 samcv i get this warning though  initialization discards ‘const’ qualifier from pointer target type
12:59 timotimo right
12:59 timotimo you need to put a const after the *
12:59 samcv yeah. just noticed that
13:00 samcv damn you search and replace
13:00 timotimo also, the file still contains all the uniname_* strings :(
13:00 ilmari timotimo: that's just the debug info
13:00 timotimo not when stripped, though
13:00 timotimo 'k
13:00 ilmari which doesn't actually get mapped at runtie
13:00 ilmari s/tie/time/
13:00 timotimo x-wing vs runtime fighter
13:01 ilmari https://randomascii.wordpress.com/2017/01/08/add-a-const-here-delete-a-const-there/
13:02 timotimo neat
13:09 Ven joined #moarvm
13:09 timotimo we could totally get sizes of moar and libmoar.so from statisfiable
13:10 timotimo though ... we'd probably want per-moar-commit rather than per-rakudo-commit resolution there
13:16 samcv that would be cool
13:17 timotimo i asked in #whateverable
13:26 Ven joined #moarvm
13:28 ilmari nwc10: 🌯 time!
13:37 ilmari[m] joined #moarvm
13:45 timotimo hm, nqp-m -e '' takes about 14000 maxresidentk, a reduction of 250k would almost be noticable :3
13:45 timotimo but with rakudo ... you'd hardly feel it at all :(
13:48 samcv timotimo, how do I go from a unsigned short *, and iterate over the values?
13:49 samcv do I have to do bitwise operations to do that?
13:49 samcv i have never tried to do this before... never needed to. just used normal ints
13:53 ggoebel joined #moarvm
14:13 timotimo depends
14:13 timotimo if you want to use pointer arithmetic, i.e. p++, or if you can deal with an index into the thing
14:14 timotimo i don't actually know what your current use case is, so i can't advise well
14:15 samcv storing pointers to the ararys in another array
14:15 samcv for the unicode names
14:15 timotimo ah
14:15 timotimo i'd spell that &otherarray[index]
14:15 timotimo like, as constant
14:23 timotimo actually, you could even store indices into the otherarray in an array and do an extra indirection when looking up stuff
14:23 samcv what do you mean otherarray. you mean the array which contains pointers to the arrays?
14:23 timotimo um
14:23 samcv the data array or the indices array?
14:24 timotimo i was still thinking of when i suggested to have all data in one gigantic array
14:24 samcv yeah. i think that would be decent
14:24 samcv I should see how big that would end up
14:52 Ven joined #moarvm
15:04 zakharyas joined #moarvm
15:23 Ven joined #moarvm
15:39 timotimo i'm listening :P
16:02 brrt joined #moarvm
16:18 brrt joined #moarvm
16:22 zakharyas joined #moarvm
16:28 colomon joined #moarvm
16:30 brrt joined #moarvm
16:38 Ven joined #moarvm
16:53 Ven joined #moarvm
18:08 Ven joined #moarvm
18:14 Geth joined #moarvm
18:17 zakharyas joined #moarvm
18:24 domidumont joined #moarvm
18:34 domidumont joined #moarvm
18:50 FROGGS joined #moarvm
19:09 nine What do you people use for profiling moar?
19:10 timotimo perf for rough ideas of what's going on (with -g) and callgrind if i need more details
19:13 timotimo aaw, perf c2c will be available starting with linux 4.10, but i'm still getting 4.9 kernels
19:45 nine Is "MVM_CALLSTACK_REGIION_SIZE" really correct or is the double I a typo?
19:46 timotimo it's being used consistently at least :)
19:47 Geth MoarVM: 21d7d6e603 | (Stefan Seifert)++ | 2 files
19:47 Geth MoarVM: Fix typo in MVM_CALLSTACK_REGIONS_SIZE's name
19:47 Geth MoarVM:
19:47 Geth MoarVM: Hopefully reduces confusion and distraction for the next one to dig into this code
19:47 Geth MoarVM: review: https://github.com/MoarVM/MoarVM/commit/21d7d6e603
19:48 nine So, where were I?
19:50 timotimo were you singing death, death, death, death, devil, devil, evil, evil songs?
20:00 domidumont joined #moarvm
20:10 Ven joined #moarvm
20:26 Ven joined #moarvm
20:28 jnthn Heh, I can to read the thing three times to spot the doubled letter :P
20:28 jnthn *I had
20:41 Ven joined #moarvm
20:55 Ven joined #moarvm
21:03 zakharyas joined #moarvm
21:40 Ven joined #moarvm
21:55 Ven joined #moarvm
22:40 cygx joined #moarvm
22:42 cygx so there's some weirdness happening when trying to build dyncall (and dyncallback specifically) with the latest 32-bit version of Strawberry Perl
22:42 yoleaux2 30 Dec 2016 21:41Z <samcv> cygx: i will look at it as soon as I wake up again. if i misinterpreted the bug
22:42 cygx for some reason I've yet to figure out, make things the C files should be compiled with C++
22:43 cygx consequently, it will first fail to not find the dyncall header (as CFLAGS will not be picked up), and it will finally fail to link due to C++ name mangling
22:45 cygx good night o/

| Channels | #moarvm index | Today | | Search | Google Search | Plain-Text | summary