Perl 6 - the future is here, just unevenly distributed

IRC log for #marpa, 2014-10-15

| Channels | #marpa index | Today | | Search | Google Search | Plain-Text | summary

All times shown according to UTC.

Time Nick Message
01:07 jeffreykegler jdurand: Ran and tested OK.
01:08 jeffreykegler The core wrapper code is in src/MarpaWarpper.c?  How does SWIG come in?
01:11 jeffreykegler jdurand: re http://irclog.perlgeek.de/​marpa/2014-10-13#i_9502457 -- so is next a concrete example?
01:15 jeffreykegler MarpaWrapper.c looks like it represents a lot of work!
03:10 ronsavage joined #marpa
03:15 CQ_ joined #marpa
04:57 jdurand joined #marpa
04:58 jdurand Re http://irclog.perlgeek.de/​marpa/2014-10-15#i_9510255 - it is a architecture choice
04:58 jdurand do you want to have lua users access to libmarpa directly and/or only to marpaWrapper
04:59 jdurand Both are possible. Aslike the proof of concept with libmarpa, there will be a need to have a SWIG interface for marpaWrapper
04:59 jdurand the concrete example can be done with both interfaces
05:19 mlilenium_ joined #marpa
05:19 mlilenium_ left #marpa
05:29 jdurand FYI http://realmensch.org/blog/fun-lua-bindings is interesting - it appears that SWIG is natively only a one-way binding - well, adding the other way is not that hard, jus a bit painful
05:54 lwa joined #marpa
06:12 jeffreykegler jdurand: re http://irclog.perlgeek.de/​marpa/2014-10-15#i_9510850.  Both!  And I'd love to see the concrete interface done both ways.
06:16 jeffreykegler Marpa::R2's THIF was intended to be a direct interface to Libmarpa, with only the absolutely necessary "value added".
06:16 jeffreykegler I take it that the approach in MarpaWrapper.c comes out of your XML work?
06:17 jdurand joined #marpa
06:17 jdurand Re http://irclog.perlgeek.de/​marpa/2014-10-15#i_9510991 - ok
06:18 jdurand Re http://irclog.perlgeek.de/​marpa/2014-10-15#i_9510997 - yes -  and that's why I know it works -;
06:19 jdurand Have to go at $work - AFK
06:19 jeffreykegler jdurand: re http://irclog.perlgeek.de/​marpa/2014-10-15#i_9510908 -- OMG.  So many choices.
06:20 jdurand ... many choices, but almost of them have pros and cons - SWIG is actively maintained, and in the opensource world, depending on a long-live-dynamic thing is a reason for the choice...
06:21 jeffreykegler jdurand: I am certainly happy to see us stick to SWIG.
08:16 lwa joined #marpa
13:39 lwa joined #marpa
13:55 rns joined #marpa
14:01 rns jeffreykeler: re http://irclog.perlgeek.de/​marpa/2014-10-14#i_9510009 — Glad you find it useful. I tried to use metalua for testing by diff'ing ASTs of lua sources formatted by Lua::Ast vs. the originals. Worked fine for the start.
14:03 rns re: Marpa::R2::ASF (ASF) and Marpa::R2::Glade (Glade) — I have a test case with an ambiguous grammar and *highly* (over 2 million ASTs delivered by $slr->value()) ambiguous input.
14:06 rns $asf->traverse( &full_traverser ) tries to enumerate all ASTs that just takes much time.
14:07 rns The low-level code from Marpa::R2::Glade doc quickly print its ambiguity report.
14:24 rns I take it that both do the right thing and just work as documented.
14:25 rns My concerns, however: max_parses and factoring_max (I know it's undocumented but still) have no effect on $asf->traverse().
14:31 rns The test case and output are here https://gist.github.com/rns/cb9f19989565309de914
14:32 rns BTW, this was a first cut of a solution to stackoverflow question, so It's kind of real. :) I made the grammar unambiguous for the final version.
14:33 rns Another concern is that ambiguous() output looks strange — only one choice.
14:34 rns So it's another bug or feature type — take a look when you have time.
15:18 jeffreykegler joined #marpa
15:18 jeffreykegler rns: re traversing ASF's
15:20 jeffreykegler ASF traversal was not intended to speed up a full traversal of all ambiguous parses -- it's purpose was to give the user a way to prune the tree sensibly as they go.
15:21 jeffreykegler The hope was that this allows the application to traverse only those parses they are interested in.
15:21 jeffreykegler I know we test and show examples of full traversals, but that's because they make sense for testing ...
15:23 jeffreykegler and because they make a good clear example.  But it's a little like the tiny calculator examples and tests -- why write a calculator that small using Marpa,
15:23 jeffreykegler when you've got shell arithmetic, Perl one-liners, etc., etc.?
15:25 rns Makes sense.
15:25 jeffreykegler max_parses is not implemented, because it doesn't really fit well with the ASF logic.  Again, the intent is that the application is pruning the tree, so if it does want some sort of arbitrary, counted cut-off, it can keep a counter and just do it.
15:25 jeffreykegler For my motivation, you might want to glance back at the Ada blog post,
15:26 jeffreykegler where I started a top down traversal of an English sentence ...
15:26 jeffreykegler my idea was that the application would select "most likely parses" as it went.
15:27 jeffreykegler My ASF code came out of that, and trying to figure out a way to helpfully traverse the tree, so that it could be pruned.
15:28 jeffreykegler In NLP-like ambiguities, there's an exponential number of possibilities, and simple enumeration is exponential, and that's a theoretical minimum.
15:28 rns Parsing Ada Lovelace's quote, yes, I remember reading it.
15:29 jeffreykegler I was going to follow up on that, turn it into a series, but reaction was so poor, I gave up on the idea and just worked quietly and alone on my ideas.
15:30 jeffreykegler But that was the problem that motivated the ASF's.
15:30 jeffreykegler Anyway, I've gone on for a bit, but bottom line ...
15:31 rns which I'm a big fan of, I have to say.
15:31 jeffreykegler ASF's and the Glade interface are to enable pruning, if you just want to enumerate ...
15:31 jeffreykegler the older interface is probably faster, not that it matters if you go exponential.
15:32 rns Thanks for explaining it's certainly clearer now.
15:33 jeffreykegler Perhaps I should, when I revisit the doc, make clear that the full traversal are *not* really the way to use that interface, they're just good as examples ...
15:34 jeffreykegler and as starting points.
15:35 jeffreykegler rns: have you been able to look at Jean-Damien's new work on the LUIF?
15:37 rns Yes and no, I cloned the repo at its very start a couple of days before, but it didn't compile under cygwin ....
15:38 rns genericLogger.c I didn't tried very hard though, thought it'll take more shape.
15:39 rns It seems it did. Will try again in a couple of days once I sort it out at $work.
15:39 jeffreykegler Great!
15:39 rns re ASF/Glade, to sum up ...
15:40 rns what worries me, 2 points: (1) $slr->ambuguous() report that looks incomplete
15:40 rns Choices start with: +Team:US
15:40 rns Choice 1, length=415, ends
15:40 rns — that's all
15:41 jeffreykegler Oh yes right, forgot that point. :-)
15:41 rns : should I file a github issue?
15:41 jeffreykegler The number of ambiguities, especially with factorings, grows exponentially and quickly ...
15:42 jeffreykegler ambiguous() is intended as an error message, so I keep it short, and try to keep it just enough to indicate the nature of the problem, and its location ...
15:43 jeffreykegler A full enumeration of the ambiguities often would be huge, and if you're simply treating ambiguity as an error unhelpful.
15:43 jeffreykegler Re github issue, probably best to discuss it here, for the moment.
15:44 jeffreykegler Because right now my response is "Works as documented, intended, hoped and expected"
15:45 rns Yes, probably I've to elaborate — ambiguous() says "Choices" and then prints "Choice 1" and that's it.
15:45 jeffreykegler I could see applications where you want a fuller enumeration of the ambiguties, but ambiguous() is not intended to do that.
15:46 jeffreykegler Ah!  Now *that* does not sound right.
15:46 jeffreykegler Yes, file the github issue.
15:46 rns Ok, I'll do. Another point ...
15:46 rns rather vague, I have to admit, is that ...
15:46 jeffreykegler It's not much of ambiguity if there is only one choice. :-)
15:47 rns Yes! (ROFL)
15:48 rns But ... ASF and Glade code give different outputs, while both had to enumerate the factorings ....
15:49 rns I take it that ASF's $full_traverser actually enumerates while Galde's low level code just lists factorings and symches whose number is always finite.
15:49 rns s/Galde/Glade.
15:51 jeffreykegler Right.  Same issue, at two levels.  I'm not sure I see this point.
15:51 jeffreykegler Shot in the dark: it might be useful to think of the ASF's as CYK in reverse.
15:51 rns Hmm, interesting ...
15:52 rns The number of ASTs possible from an ASF grows exponentially depending on the number of factorings?
15:52 jeffreykegler That is, CYK, you work bottom-up, look at all pairs, and basically choose some (often statistically) to prune the search space.
15:52 jeffreykegler Yes, exponential growth is all over the place here, and it happens guickly with even mild provocation.
15:53 rns And ASF goes to all possible combinations top-down?
15:53 jeffreykegler So ASF's are the same idea, only top-down.
15:53 jeffreykegler Same idea as CYK, that is, pruning the search space.
15:54 rns Ok, got it.
15:54 rns Whereas Glade (low level) just lists factorings, not trying to enumerate the parses.
15:54 jeffreykegler We *could* enumerate parses, but the only interface did that just fine.
15:55 jeffreykegler * only -> old
15:56 jeffreykegler That is, if you only want to enumerate, Glade will do it, but you've just picked a more complicated way of doing the same thing that you could before ...
15:56 jeffreykegler and, yes, it's not faster than the old way either -- slower if anything.
15:57 jeffreykegler Because the old $recce->value() interface, since it did nothing but enumerate, would just rattle them off one after anotther.
15:58 rns Thanks for explaining, I think I understand it all much better now.
15:59 jeffreykegler That's where the ASF/Glade idea is new -- it's a way of *exploring* ambiguity.  If you only want to enumerate, they're just a pointless complication.
16:01 rns I used ASF::traverse() to build parse-forest grammars which I later pruned and cleaned up to produce AST — as a way to disambguate...
16:02 rns But if the number of parses grows exponential, building a PFG makes no sense ...
16:02 jeffreykegler Right.
16:03 jeffreykegler My idea was that pruning (at least a 1st cut at it) takes place during ASF traversal.
16:03 rns Yep.
16:03 jeffreykegler Because otherwise in NLP exponential growth quickly kills you.
16:04 rns Yes.
16:04 jeffreykegler If you can work up an example illustrating this ...
16:04 jeffreykegler A blog post might be helpful, because I have to think that a lot of people just didn't see what my point was here.
16:05 jeffreykegler The "CYK in reverse" trope might make it clearer, because folks *do* understand CYK ...
16:06 jeffreykegler or at least they use the term a lot. :-)
16:06 rns :)) I'll see what I can do. In a meanwhile, can this https://github.com/rns/MarpaX-ASF-PF​G/blob/master/t/02_expr_number.serve as an example?
16:07 rns it takes ambiguous grammar for arithmetic expressions builds PFG for it, prunes it based on associativity, cleans up and what's remained is the ast.
16:07 jeffreykegler I get a 404 on that link.
16:08 rns Sorry, this one should work https://github.com/rns/MarpaX-ASF-​PFG/blob/master/t/02_expr_number.t
16:09 rns But building the full PFG is also enumerating, not exploring the ambiguity.
16:09 rns So, I thought, can Glade or ASF be used to build a subsets of a PFG that can have still sufficient information to pruner,
16:10 jeffreykegler Yes, that'd be the hope.
16:10 jeffreykegler I'll print it out and read it over.
16:11 rns Great. I'd call it ... lazy enumerating/pruning ...
16:13 rns In the above example, a rule is pruned if
16:13 rns for each node that has a + operator, its right operand cannot be a non-terminal that has a node with a + operator.
16:13 rns per Grune and Jacobs
16:13 jeffreykegler What page?
16:13 rns Searching ...
16:15 rns 3.7.4 Parse-Forest Grammars, 2008'd edition, will give the page number in a moment ...
16:16 rns Page 91 — http://goo.gl/pJWvSZ
16:16 jeffreykegler Found it.
16:16 rns Ok. So ...
16:17 rns if, e.g. for a addition rule we can find RHSes for all its RHS symbols, we can prune them one by one by checking for the right operand.
16:18 rns Not really enumerating all parses.
16:18 jeffreykegler Right.  Let me read this.
16:18 rns Ok. One more thins — Grune's example (sum of digits) is rather simple — https://github.com/rns/MarpaX-ASF​-PFG/blob/master/t/01_sum_digit.t
16:19 jeffreykegler Yes?
16:21 rns The code for the Grune's example (sum of digits) — it's in the above link.
16:21 rns Oh and the module — https://github.com/rns/MarpaX-ASF-PF​G/blob/master/lib/MarpaX/ASF/PFG.pm
16:22 jeffreykegler Right.  You're building up to a question?
16:22 rns Well, just reiterating now that it must be a bit clearer ...
16:23 rns Can in the context of som
16:24 rns Sorry,
16:24 rns restarting
16:25 rns Can only a subset of PFG which can be used to apply the pruning criteria (right operand having no +) from the above test case 01_sum_digit.t, be built by low-level ASF interface (Glade)?
16:26 jeffreykegler Let me look & think this over.  I don't want to answer too hastily.
16:26 rns Sure, thanks a lot for looking into this.
16:27 jeffreykegler Thanks for your interest & support!
16:27 jeffreykegler AFK
16:27 rns Good luck!
17:54 jdurand_ joined #marpa
18:32 rns jeffreykegler: a tutorial on metalua — http://lua-users.org/lists/​lua-l/2006-11/msg00343.html — seems to be easier to get started.
18:32 rns Didn't tried it yet though, gonna try on the weekend if time permits.
19:11 ilbot3 joined #marpa
19:11 Topic for #marpa is now Start here: http://savage.net.au/Marpa.html - Pastebin: http://scsys.co.uk:8002/marpa - Jeffrey's Marpa site: http://jeffreykegler.github.io/Marpa-web-site/ - IRC log: http://irclog.perlgeek.de/marpa/today
20:42 flaviu joined #marpa

| Channels | #marpa index | Today | | Search | Google Search | Plain-Text | summary