
IRC log for #marpa, 2015-06-05


All times shown according to UTC.

Time Nick Message
00:45 koo6 joined #marpa
01:05 jeffreykegler joined #marpa
01:24 jeffreykegler There's a FAIL from CPANtesters for Marpa::R2 -- looks like it is not Marpa's fault, but I thought I'd post some links ...
01:24 jeffreykegler for reference, and in case others have some insights.
01:25 jeffreykegler First, the report
01:25 jeffreykegler http://www.cpantesters.org/cpan/report/5731b98a-0ae2-11e5-a85a-e28fcaadd3a7
01:25 jeffreykegler A very helpful thread on perlmonks: http://www.perlmonks.org/?node_id=845132
01:26 jeffreykegler And it refers to this: http://peknet.com/blog/projects/swish/test-failures
01:27 jeffreykegler Libmarpa has been very stable, and it's a safe guess that any bugs are from elsewhere, but it is not a 100% thing.
01:29 jeffreykegler Libmarpa is 15,000 lines of pointer-twiddling mathematical code, and any programmer who assures you he is 100% sure there are no errors in something like that ...
01:29 jeffreykegler is either fooling himself or trying to fool you.
01:51 sadmac jeffreykegler: I parsed a simple grammar and it was ok. I also analyzed the actual target grammar. We predict up to 200 states in a given position, and it appears to be steady, not increasing as we go along.
01:52 jeffreykegler good
01:52 sadmac so yeah, my prediction is just bushy for this grammar. I'm rewriting around different data structures.
01:53 jeffreykegler Predictions can be implemented as a bit vector of rules, by the way.
01:53 sadmac we allow non-deterministic and zero-length tokens, so I'd used some higher-level data structures previously
01:54 sadmac (and non-deterministic text streams if you consider error correction)
01:54 jeffreykegler Every prediction has the same origin (= the current earley set) and the same dot position (position 0), so the only thing which varies is the rule.
01:54 sadmac right
01:55 jeffreykegler Btw, the Aycock-Horspool states also were excellent at memoizing predictions -- though in practice not much good at anything else
01:56 sadmac jeffreykegler: I cache prediction sets for each symbol already. I was debating making those cached results recursive? so for symbol B we'd store a predicted set of "B->C,C->D,D->E", so sort of a disintegratable Aycock-Horspool
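The bit-vector idea from this exchange can be sketched as follows. This is a hypothetical illustration in Python, not code from Libmarpa or ceylon.parse, using an invented (lhs, rhs) rule format. Since every prediction shares the current origin and has its dot at position 0, a prediction set is fully described by which rules it contains, so a memoized per-symbol transitive closure over rule indices is enough:

```python
from functools import lru_cache

# Toy grammar: list of (lhs, rhs) rules; the list index is the rule number.
GRAMMAR = [
    ("S", ("A", "x")),   # rule 0
    ("A", ("B", "y")),   # rule 1
    ("B", ("z",)),       # rule 2
]

@lru_cache(maxsize=None)
def prediction_bits(symbol):
    """Bit vector (as a Python int) of every rule predicted, directly or
    transitively, when `symbol` is expected.  Origin and dot position are
    the same for all predictions, so the rule index is all that varies."""
    bits = 0
    work = [symbol]
    seen = set()
    while work:
        sym = work.pop()
        if sym in seen:
            continue
        seen.add(sym)
        for i, (lhs, rhs) in enumerate(GRAMMAR):
            if lhs == sym:
                bits |= 1 << i            # predict rule i at dot position 0
                if rhs:
                    work.append(rhs[0])   # and predict its leftmost symbol
    return bits
```

With this toy grammar, predicting S pulls in rules 0, 1, and 2 (bits 0b111); unioning such vectors per Earley set is a few word-sized OR operations.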
01:56 jeffreykegler After experimenting with all three of the mentioned optimizations, my current theory is to just give each prediction its own Earley item.
01:57 jeffreykegler The overhead will always be less than a constant, and as technology progresses will decrease.
01:57 sadmac right, we do that. we just cache the rules lists.
01:57 sadmac fair enough
01:58 jeffreykegler Any optimization of predictions makes them non-orthogonal with other Earley items and winds up costing in later phases.
02:00 sadmac jeffreykegler: the idea was to store and remember Aycock-Horspool states, but convert them to normal earley states as we put them in to the state set.
02:00 jeffreykegler If I understand you correctly, that is how Marpa originally worked.
02:01 jeffreykegler and it proved to be just too much trouble -- over-optimization.
02:01 sadmac fair enough
02:01 jeffreykegler Have you done Leo yet?
02:01 jeffreykegler And remind me, what language are you working in?
02:01 sadmac jeffreykegler: no. I analyzed my parse though, and we do indeed avoid all right recursion.
02:01 sadmac jeffreykegler: Ceylon
02:02 jeffreykegler You have a specific grammar in mind (my apologies, I think I remember us discussing this ...
02:02 jeffreykegler but the details escape my memory.)
02:02 sadmac jeffreykegler: I'm self-hosting the language, so Ceylon is both my implementation language and my target.
02:03 jeffreykegler And you are not trying to create a general-purpose BNF parser then, just a Ceylon parser.
02:04 sadmac jeffreykegler: I am trying to create a general purpose parser. I am both self-hosting the language and adding parsing to its standard library.
02:04 jeffreykegler In which case, right recursion will probably be needed someday.
02:05 sadmac jeffreykegler: the idea is to just transform right recursion into Kleene star (which I think ends up being exactly like Leo optimization)
02:05 sadmac jeffreykegler: so we turn 'A->aA' into 'lassoc: A->a*A'
02:06 sadmac (where the effect of 'lassoc' is effectively to forbid predicting this rule from its own rightmost symbol)
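A minimal sketch of the rewrite sadmac describes, again as a hypothetical Python illustration rather than ceylon.parse's actual representation: the (lhs, rhs) rule format, the ('*', sym) repetition marker, and the 'lassoc' tag (standing in for "never predict this rule from its own rightmost symbol") are all inventions for illustration:

```python
def rewrite_right_recursion(rules):
    """rules: list of (lhs, rhs-tuple).  Each directly right-recursive
    rule A -> alpha A has alpha starred and is tagged 'lassoc', so
    A -> a A becomes A -> a* A; other rules pass through untagged."""
    out = []
    for lhs, rhs in rules:
        if rhs and rhs[-1] == lhs:
            starred = tuple(("*", sym) for sym in rhs[:-1])
            out.append((lhs, starred + (lhs,), "lassoc"))
        else:
            out.append((lhs, rhs, None))
    return out
```

For example, rewrite_right_recursion([("A", ("a", "A")), ("A", ("b",))]) leaves the base alternative A -> b intact and turns A -> a A into the 'lassoc'-tagged A -> a* A, so Earley items no longer chain down the right edge of the derivation.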
02:06 jeffreykegler The grammar will not in general have the same structure, but ...
02:07 jeffreykegler not doing Leo is a design decision, one which you can make.
02:08 sadmac I'm not married to my choice. I'll see what happens when the bug reports start coming in :)
02:08 jeffreykegler Earley's without fast RR will still be so much better than what folks are used to, there won't be bug reports. :-)
02:11 jeffreykegler Please keep us informed of your progress!
02:11 sadmac jeffreykegler: of course
02:11 sadmac jeffreykegler: https://github.com/sadmac7000/ceylon.parse
02:45 jeffreykegler rns: an impressive amount of work is in MarpaX::Languages::Lua::AST
03:06 ronsavage joined #marpa
06:22 ronsavage joined #marpa
12:23 lucs joined #marpa
18:04 lwa joined #marpa
23:25 ronsavage joined #marpa
