IRC log for #marpa, 2014-12-11

All times shown according to UTC.

Time Nick Message
00:05 ronsavage joined #marpa
00:17 koo6 joined #marpa
00:18 slothmachine joined #marpa
03:30 ronsavage joined #marpa
04:32 jeffreykegler joined #marpa
04:33 jeffreykegler I think (perhaps I'm wrong) that a few people are trying to duplicate the Marpa algorithm in another language from scratch ...
04:33 jeffreykegler a goal of which I am very much supportive.
04:34 jeffreykegler One step toward getting full Marpa clones out there ...
04:34 jeffreykegler might be to create a wiki with pseudo-code.
04:35 jeffreykegler If someone wanted to do that, I'd contribute as best I could, working it in with Kollos and other stuff.
04:35 jeffreykegler As a basis for the pseudo-code, there is the pseudo-code in the Marpa theory paper:
04:36 jeffreykegler https://docs.google.com/a/jeffreykegler.com/file/d/0B9_mR_M2zOc4SjctSE1kTXpVMkE/edit
04:37 jeffreykegler I'm very pleased to note that if you Google "Marpa theory paper", that's the first hit.  Nice.
04:38 Aria Nice!
04:38 Aria I was just thinking something like that would be fantastic.
04:38 jeffreykegler A first note about it.  The pseudo-code in the theory paper includes the Aycock&Horspool LR(0) states, which my original implementation used.
04:39 jeffreykegler I found these counter-productive.
04:39 Aria I'm still wrapping my head around the Leo improvements right now. With my limited time and attention, that's proving difficult.
04:41 * jeffreykegler thinks he'd better complete his thought about LR(0) states before responding to Aria, though he's very pleased at the positive reaction.
04:42 jeffreykegler To eliminate the LR(0) states, just treat them as if each contained a single dotted rule, so that all the Marpa items now become traditional Earley items.
04:43 jeffreykegler If someone wants to attempt this rewrite as a wiki, I'll be happy to review and make sure it's correct.
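For readers following along, here is a minimal sketch of what "traditional Earley items" can look like in code: each item holds a single dotted rule plus its origin. It assumes Python and a grammar with no empty rules. It is not Marpa's code, it is not the pseudo-code from the theory paper, and it leaves out Leo's right-recursion handling. The names `Item` and `recognize` are made up for illustration.

```python
from collections import namedtuple

# A traditional Earley item: one dotted rule (lhs, rhs, dot) plus its origin.
Item = namedtuple("Item", "lhs rhs dot origin")


def recognize(grammar, start, tokens):
    """Plain Earley recognizer (assumes the grammar has no empty rules).

    grammar maps each nonterminal to a list of right-hand sides (tuples).
    Returns (accepted, earley_sets).
    """
    sets = [[] for _ in range(len(tokens) + 1)]

    def add(i, item):
        if item not in sets[i]:
            sets[i].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        j = 0
        while j < len(sets[i]):  # the set may grow while we walk it
            item = sets[i][j]
            j += 1
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:  # predict
                    for rhs in grammar[sym]:
                        add(i, Item(sym, rhs, 0, i))
                elif i < len(tokens) and tokens[i] == sym:  # scan
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:  # complete
                for parent in list(sets[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(i, parent._replace(dot=parent.dot + 1))

    accepted = any(
        it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
        for it in sets[-1]
    )
    return accepted, sets


if __name__ == "__main__":
    # Right-recursive toy grammar: A -> 'a' A | 'a'
    grammar = {"A": [("a", "A"), ("a",)]}
    accepted, sets = recognize(grammar, "A", list("aaaa"))
    print(accepted, [len(s) for s in sets])
```

With a right-recursive grammar like the toy one above, the later Earley sets pick up one completed item per earlier origin. That growth is what Leo's method is meant to avoid.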
04:43 jeffreykegler Aria: re http://irclog.perlgeek.de/marpa/2014-12-11#i_9788557 -- good!
04:45 jeffreykegler I think Loup Vaillant is also heading in that direction: http://loup-vaillant.fr/tutorials/earley-parsing/ -- although maybe he prefers to work alone.
04:47 jeffreykegler and pczarn's http://irclog.perlgeek.de/marpa/2014-12-10#i_9787342 was what made me think there might be interest in something like this.
04:47 Aria Oooh. And he's summarized the change.
04:48 Aria Still stuffed with greek that my brain wants to gloss over, but this is much more accessible than the paper.
04:48 jeffreykegler How's that?
04:48 Aria Definitely useful.
04:48 jeffreykegler What "change"?
04:50 jeffreykegler http://loup-vaillant.fr/tutorials/earley-parsing/right-recursion -- OMG
04:50 jeffreykegler I hadn't noticed that -- yes, it looks like a writeup of Leo's algorithm -- I'll have to check it out.
04:51 jeffreykegler Loup's preamble is strange though -- he says about it: "its performance benefits are unclear —they need to be tested"
04:52 jeffreykegler Loup doesn't seem to be aware that I've implemented it and Leo's 1991 algorithm is right now the basis of a significant and growing code base.
04:53 Aria Yeah, definitely
04:55 ronsavage jeffreykegler: Hmmm - I'm beginning to suspect Loup deliberately avoids reading your publications because he does not want to be dependent on you. Or - perhaps the same thing - he does not want to acknowledge that you've surpassed him theoretically.
04:57 jeffreykegler ronsavage: I think that's kind of uncharitable -- I know that, for myself, I have to be ruthless about what literature I decide to delve into or not.
04:58 jeffreykegler Anyway, if anyone wants to investigate the kind of thought process needed to create Marpa, you can read Loup's series ...
04:59 jeffreykegler He's taken the first steps, but has missed some things that I found.
04:59 jeffreykegler :-)
05:05 jeffreykegler By the way, big storm predicted for California, where I live, and power outages may occur, so if I disappear for a few days, that will be why.
05:11 jeffreykegler I may write up some comments on Loup's piece -- particularly the questions he poses -- because of course I've solved all of them.
05:11 jeffreykegler I'll let this channel know first.
05:13 ronsavage OK. I could be wrong of course.
05:15 ronsavage Nevertheless, he could ignore almost everything, but still study your work.
05:18 jeffreykegler My paper is not published via the traditional avenues -- these have turned dysfunctional, but nonetheless they still carry considerable prestige.
05:19 jeffreykegler And with some justice, since these papers *did* get refereed.
05:20 jeffreykegler So for that reason, folks go to Joop Leo's paper instead of mine, ...
05:21 jeffreykegler which is a bit ironic, because Joop's marvelous work was (almost) completely ignored until I did mine.
05:23 ronsavage Have you submitted it to a traditional avenue? Or is the dysfunctionality so bad?
05:24 jeffreykegler It's that bad.  The much-discussed Might&Darais paper was never actually published -- it was rejected, and so they put it on the Internet.
05:24 jeffreykegler And I do *not* have an academic position.  Might & Darais did -- they were professor & grad student.
05:25 ronsavage OK. And is this just because of the lack of progress before your work, or are the editors so inward-looking you choose not to deal with them?
05:26 jeffreykegler Parsing is really, really, really, really out of fashion.
05:26 jeffreykegler The meltdown of bottom-up parsing has made nobody want to go near it.
05:26 Aria Huh. I just compiled libmarpa with emscripten
05:26 ronsavage Academia - sigh. And out of fashion too.
05:26 jeffreykegler Also, ron, publishing life cycles take months ...
05:27 ronsavage Aria: Is that an intermediate step in your work, or just for interest? I assume the former.
05:27 jeffreykegler and they'd often insist on putting it behind a paywall.
05:27 * jeffreykegler assumes that Aria is interested in world conquest
05:27 ronsavage Ah, yes. The paywall issue. Another thing to avoid if possible.
05:27 Aria Just for interest.
05:27 Aria But I'd love to benchmark it.
05:28 jeffreykegler Right, and recall that Joop *did* get published and was (essentially) ignored for two decades ...
05:28 Aria That it took me about ten minutes to do certainly makes it an interesting avenue.
05:28 jeffreykegler which might have extended to another two if I hadn't come along.
05:28 ronsavage "Nature plumps for an open-ish access-ish model" (on this page: http://phenomena.nationalgeographic.com/2014/12/06/ive-got-your-missing-links-right-here-06-december-2014/) links to http://www.nature.com/news/nature-promotes-read-only-sharing-by-subscribers-1.16460.
05:30 jeffreykegler ronsavage: so these days publishing in the traditional way takes a lot of time and effort, for little or no gain ...
05:30 jeffreykegler or worse than no gain, if you wind up behind a paywall.
05:31 ronsavage Then it's fair to say your breakthru does not need to be published in an academic-type journal now..... And that's another reason such journals are in decline :-)
05:31 jeffreykegler Bingo
05:32 jeffreykegler People still publish in the traditional way when they are academics, and their jobs literally depend on racking up their publication count.
05:32 jeffreykegler I've got no academic job to lose. :-)
05:33 ronsavage Just imagine. If you had a huge amount of free time, you could start your own on-line journal dedicated to parsing. Of course, some would say you've just done that!
05:33 ronsavage Yes, I understand that saying: Publish or Perish. No mention of quality, though.
05:34 jeffreykegler In theory the journals enforce quality.  In the case of parsing theory, they do this by rejecting everything that comes in the door. :-)
05:35 jeffreykegler ronsavage: Remember we were talking about grad schools dropping compiler and parsing courses?
05:36 jeffreykegler Aria: actually it is pretty interesting that Libmarpa compiled so easily ...
05:37 jeffreykegler though I would hope so since it *is* straight C90.
05:37 Aria Yeah. Wish I had something using libmarpa to test against for speed. will require more hacking.
05:37 Aria I'd love to know what the cost of that approach is.
05:38 jeffreykegler My guess is that whatever the cost is, it will go down ...
05:38 Aria Quite.
05:38 Aria It ends up being a 300 kb javascript program.
05:38 jeffreykegler when something becomes the kind of focus that Javascript does, it is stunning what kinds of optimization folks can end up with.
05:38 Aria Though that's with dead code elimination on and nothing linking to it.
05:39 Aria It really is. V8 never ceases to amaze me.
05:39 jeffreykegler IIRC 300kb is not even a serious challenge for asm.js
05:39 Aria Yeah. Though v8 doesn't strictly do asm.js -- it just tries to optimize it as best it can like it does everything else.
05:40 Aria And certainly not all in a single function. That's what breaks V8: single functions that are too large.
05:40 Aria (since it tries to keep its compiler able to run in real time)
05:41 jeffreykegler Aria: thanks for finding the new Loup post -- for some reason my searches had missed it.
05:42 Aria You're welcome!
05:43 ronsavage Yes, I remember you mentioning the dropping of such courses. It has /really/ stuck in my mind.
05:45 jeffreykegler ronsavage: OK.  that tells you what academics are thinking about parsing.  These same guys decide what goes in the journal -- so imagine the priority they give to a paper on parsing?
05:46 jeffreykegler This, by the way, is why I chose to implement Marpa, and not just do a paper on it.
05:47 jeffreykegler I saw what happened to Joop Leo's wonderful work, and decided that, if a new approach to parsing was going to have a chance, it would have to be made available *as an implementation*.
05:47 jeffreykegler A downside of that, of course, is that it would have to actually work. :-)
05:49 jeffreykegler As an aside, when I say Joop's work was almost ignored, there was an exception.  Grune & Jacobs, in their wonderful book, gave Joop pretty good coverage -- which of course also got ignored. :-)
05:49 Aria Hmm. Only thing stopping me from compiling kollos is not having an emscriptened lua.
05:49 Aria Might have to play with that this weekend.
05:51 ronsavage An implementation is definitive, of course. Well, for all of us who choose to acknowledge its existence.
05:52 jeffreykegler Aria: Rather than Kollos, you might want to just compile that pure-Libmarpa JSON parser -- it's json.c in the kollos repo, but doesn't actually use Kollos, which is good because Kollos does not really exist yet.
05:52 Aria Oh excellent.
05:52 Aria I'll do that.
05:53 jeffreykegler I think that json.c file is also in the Libmarpa archive, in a bit more isolated form.
05:54 jeffreykegler ronsavage: Actually, I aim at two communities, with two different ideas of what is "definitive".
05:54 jeffreykegler For practitioners, yes, an implementation is definitive.
05:55 jeffreykegler If you're a theory guy, the writeup, with attendant proofs of correctness and of your complexity claims, is definitive.
06:00 jeffreykegler Anyway, going back to Loup and my work, I notice that he *does* cite my paper, and goes out of his way to give me credit for an idea at one point.
06:01 jeffreykegler Loup is actually rather generous about it.
06:07 Aria Alright. I'm off to bed. Good luck with the storm, jeffreykegler.
06:09 jeffreykegler Aria: good night.  Actually I'm from Massachusetts where you are now, so should be able to handle anything California throws at me, weather-wise.
06:09 Aria Good good. But it's always the people around you making life difficult ;-)
06:11 ronsavage Ahh. If Loup cites you then perhaps I'd better retract my theory.......
06:11 jeffreykegler And in a very generous way, as well
06:11 ronsavage Even better.
06:12 jeffreykegler My own theory would be that Loup is academic in his approach ...
06:12 ronsavage I've just this moment finished coding the re-write of GraphViz2::Marpa::PathUtils, so I'll work on the docs tomorrow (it's 17:15 here now).
06:12 jeffreykegler so that implementations are relatively invisible to him.
06:13 ronsavage But that doesn't explain things, I feel.
06:13 jeffreykegler Being academic in approach myself, I understand.
06:14 jeffreykegler Because actually there *are* limits to "proof by implementation" ...
06:14 ronsavage Perhaps he thinks there are yet other breakthroughs waiting to be revealed.
06:15 jeffreykegler with an implementation you never know if the next example will break your parser ...
06:15 ronsavage Limits? Yes, IIRC the code of the 4-color problem was, err, a problem.
06:15 jeffreykegler that's where the proofs of correctness and of your speed claims come in.
06:16 ronsavage But even if the parser fails in some cases, that does not prove the underlying theory is faulty. At least, you can always tell yourself that.
06:16 jeffreykegler If the theory is OK, it means the implementation can be fixed.  No theory, and you just don't know.
06:18 jeffreykegler There's the extreme example of Don Knuth, who is supposed to have once written to someone saying ...
06:18 jeffreykegler "Here's the algorithm.  I've proved it correct, but I haven't tested it, so I don't know if it works."
06:20 jeffreykegler Going off-line for a while ...
06:27 ronsavage ok.
08:31 jeffreykegler joined #marpa
08:32 jeffreykegler https://github.com/jeffreykegler/kollos/blob/master/notes/misc/leo2.md
08:32 jeffreykegler I've written up some comments on Loup's right recursion tutorial, in the form of an "open letter" to Loup.
08:33 jeffreykegler I actually plan to email this to him, but I thought I'd open it up here for comments first.
08:33 jeffreykegler It's late CA time, but I'll backlog in the morning.
08:33 jeffreykegler AFK.
09:18 pczarn joined #marpa
09:58 ronsavage joined #marpa
10:00 ronsavage Contrariwise, just because it's published in an academic journal does not prove the theory is sound.
10:53 pczarn my interface works like this: https://github.com/pczarn/rust-marpa/blob/slif/marpa-macros/example/ambiguous_grammar_slif.rs
10:54 pczarn it's totally scanning, not scanless
11:06 pczarn The largest advantage is that it expands to code that uses (thin) bindings directly
11:09 pczarn so it's as fast as using thin in C, thanks to procedural macros (similar to D's compile-time regexes)
11:11 pczarn I need a plan for what to do next
11:12 pczarn does SLIF use marpa_r_alternative for character classes? can I trace calls to see how SLIF works?
11:15 pczarn I guess it instantiates two grammars
11:32 flaviu joined #marpa
11:33 pczarn joined #marpa
12:35 lwa joined #marpa
12:47 jeffreykegler joined #marpa
12:50 jeffreykegler pczarn: re http://irclog.perlgeek.de/marpa/2014-12-11#i_9789862 -- It looks like you're doing the "racing version" of Marpa.
12:52 jeffreykegler I've long wanted somebody to do a BNF -> C -> executable version of Marpa and, yes, BNF -> rust -> executable should be pretty much the same thing.
12:54 jeffreykegler The SLIF, yes, does create a 2nd Marpa grammar (L0) to do lexing.  It uses Perl itself to do character classes via memoized callbacks.  That's partially in Perl/XS, partially in Perl, and there's no built in "trace calls" mechanism.
12:56 jeffreykegler But actually the SLIF's lexing is not fast, though it is fast enough that its speed issues don't show up in the context of Perl's overhead.
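As a rough illustration of the "memoized callbacks" idea for character classes, here is a small Python sketch. It is not the SLIF's Perl/XS code. The class names, the `CHAR_CLASSES` table, and the `classes_for` function are all hypothetical. The point is only that class membership is computed once per distinct character and then served from a cache.

```python
import re
from functools import lru_cache

# Hypothetical character classes; names and patterns are illustrative only.
CHAR_CLASSES = {
    "digit": re.compile(r"[0-9]"),
    "word": re.compile(r"\w"),
    "space": re.compile(r"\s"),
}


@lru_cache(maxsize=None)
def classes_for(ch):
    """Return the set of class names that match one character.

    The regex checks run only the first time a given character is seen;
    later lookups for the same character come from the cache.
    """
    return frozenset(
        name for name, rx in CHAR_CLASSES.items() if rx.fullmatch(ch)
    )


if __name__ == "__main__":
    for ch in "a1 a1":
        print(repr(ch), sorted(classes_for(ch)))
    print(classes_for.cache_info())
```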
12:56 jeffreykegler But since you're doing the racing version ...
12:56 jeffreykegler why not use Flex or some equivalent, which would be faster.
12:57 jeffreykegler It would be less powerful than a Marpa lexer, but in most applications using Marpa's power for lexing is overkill.
12:58 jeffreykegler Kollos will allow multiple lexers, but initially I plan one based on Lua's patterns, and therefore not modeled after the SLIF.
12:59 jeffreykegler The one thing you might want to do is allow LATM, so that you only search for "acceptable" tokens, unlike in the conventional lexer.
13:00 jeffreykegler This requires a lexer which can be told which subset of lexemes to search for each time -- I don't know if Flex can do that conveniently.
13:01 jeffreykegler Alternatively, you might cobble together a lexer using some open-source regex engine.
13:01 jeffreykegler Back to sleep.  AFK.
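To make the LATM suggestion concrete, here is a minimal sketch of a lexer cobbled together from a regex engine that is told, on each call, which subset of lexemes to look for. It assumes Python's `re` module. The `LEXEMES` table and the `longest_acceptable` function are hypothetical names, and how the parser produces the acceptable set is left out entirely. This is not how the SLIF or Kollos does it.

```python
import re

# Hypothetical lexeme table; names and patterns are illustrative only.
LEXEMES = {
    "NUMBER": re.compile(r"[0-9]+"),
    "NAME": re.compile(r"[A-Za-z_]\w*"),
    "KEYWORD_IF": re.compile(r"if"),
    "PLUS": re.compile(r"\+"),
}


def longest_acceptable(text, pos, acceptable):
    """Longest match at `pos`, considering only lexemes in `acceptable`.

    Returns a list of (name, matched_text) pairs. Lexemes that tie for the
    longest length are all returned; an empty list means nothing acceptable
    matched here.
    """
    best_len = 0
    best = []
    for name in sorted(acceptable):  # sorted for a deterministic order
        m = LEXEMES[name].match(text, pos)
        if m is None or m.end() == pos:
            continue  # no match, or an empty match
        length = m.end() - pos
        if length > best_len:
            best_len, best = length, [(name, m.group())]
        elif length == best_len:
            best.append((name, m.group()))
    return best


if __name__ == "__main__":
    text = "if+1"
    # A conventional lexer would try every lexeme; here the caller narrows it.
    print(longest_acceptable(text, 0, {"NAME", "KEYWORD_IF"}))
    print(longest_acceptable(text, 0, {"NAME"}))
    print(longest_acceptable(text, 0, {"NUMBER"}))
```

Returning ties rather than picking a winner is a design choice in this sketch. It leaves room for a parser that can take several alternatives at one position, while a caller that wants a single token can just take the first entry.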
14:25 koo6 joined #marpa
14:26 slothmachine joined #marpa
17:49 jdurand_ joined #marpa
18:16 pczarn joined #marpa
18:43 jeffreykegler joined #marpa
19:21 jeffreykegler https://github.com/jeffreykegler/kollos/blob/master/notes/misc/leo2.md
19:22 jeffreykegler I put the "Open Letter to Loup" into "final" form
19:37 jdurand_ Jeffrey - in "Marpa's solution is easy" you introduce Marpa for the first time assuming he knows (perhaps he knows) about it
19:37 jeffreykegler He does know about it
19:37 jdurand_ Ok - I'd just change "Marpa's solution is easy" to "<link to marpa>Marpa</link>'s solution is easy"
19:38 jeffreykegler Perhaps.  Anybody who does not know where to find Marpa is going to have *lots* of trouble understanding that letter.
19:43 lucs jeffreykegler: Um, strange wording?: "In Joop Leo's he describes ..."
19:48 jeffreykegler jdurand_: lucs: I just pushed those two fixes.  Thanks!
19:50 lucs You bet.
19:56 jeffreykegler AFK
20:41 flaviu joined #marpa
20:53 ronsavage A new string library for C: http://bstring.sourceforge.net/
20:56 jdurand_ ronsavage: many thanks, will look into it - although I totally do not understand a point in its FAQ saying that 'Actually, from a semantics point of view, C/C++ are horribly unportable'
20:56 jdurand_ AFK
21:05 ronsavage jdurand: I think it's a reference to point 6 (reduction of undefined scenarios) in the 2nd question from the top.
22:57 jeffreykegler joined #marpa
