Time 
Nick 
Message 
00:51 

Idiosyncrat joined #marpa 
03:01 

VsyachePuz joined #marpa 
04:19 

ronsavage joined #marpa 
05:15 

Idiosyncrat joined #marpa 
06:17 
ronsavage 
Hey! I'm just signed off $work for the year. Now, ..... 
06:19 
ronsavage 
Oops. I'm => I've, And no, I'm /not/ full of Xmas cheer (yet). 
07:19 

koo7 joined #marpa 
16:00 

JPGainsborough joined #marpa 
17:08 

Idiosyncrat joined #marpa 
17:11 
Idiosyncrat 
http://www.antlr.org/papers/LL-star-PLDI11.pdf 
17:11 
Idiosyncrat 
I went back and looked at the ANTLR article for complexity claims. 
17:12 
Idiosyncrat 
It's a little hard to unpack, but I think Parr&Fisher do not claim better than exponential. 
17:13 
Idiosyncrat 
They do claim linear if you turn on memoization and change to a PEG strategy. 
17:14 
Idiosyncrat 
What they try to do is to use the best available top-down strategy for each grammar, ramping up gradually. 
17:16 
Idiosyncrat 
Left recursion is not handled. 
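(For illustration: a toy Python sketch, not ANTLR's or Marpa's actual machinery, of why naive recursive descent cannot handle a left-recursive rule such as expr <- expr '-' num | num. The procedure for expr calls itself at the same input position before consuming anything, so it never makes progress.)

    def num(text, pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return end if end > pos else None          # position after the match, or None

    def expr(text, pos):
        # Alternative 1: expr '-' num -- recurses before consuming a single token.
        left = expr(text, pos)
        if left is not None and left < len(text) and text[left] == "-":
            right = num(text, left + 1)
            if right is not None:
                return right
        # Alternative 2: num
        return num(text, pos)

    try:
        expr("7-3-2", 0)
    except RecursionError:
        print("left recursion never consumes a token; the stack blows up instead")

Top-down tools therefore have to detect, reject, or rewrite such rules, while an Earley-style parser like Marpa accepts them directly.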
17:17 
Idiosyncrat 
The claim of "linear in practice" is a bit circular -- it's likely that if your grammar is linear for any *top-down* parser, 
17:17 
Idiosyncrat 
it is linear for ANTLR. 
17:18 
Idiosyncrat 
So if parsing "practice" to you means top-down (which it does for Parr&Fisher and much of the profession these days), 
17:19 
Idiosyncrat 
then that claim is viable. 
17:20 
Idiosyncrat 
But certainly, in the practice of Marpa, there are lots of grammars people can and do use which are not linear for ANTLR. 
17:20 
ceridwen 
Idiosyncrat, Any specific examples you can share? 
17:20 
Idiosyncrat 
Examples of what? 
17:22 
ceridwen 
Marpa grammars that are not linear for ANTLR. 
17:23 
Idiosyncrat 
The one at the start of section 2 in Parr&Fisher, which they report goes exponential in certain cases for ANTLR 
17:23 
ceridwen 
A simple top-down parser with memoization is worst-case polynomial. 
17:24 
ceridwen 
So I'd be surprised if ANTLR is worst-case exponential? 
17:24 
Idiosyncrat 
I believe it is linear for Marpa. 
17:25 
Idiosyncrat 
The SLIF's grammar is ambiguous, so I doubt it is linear for ANTLR 
17:25 
Idiosyncrat 
Also, anything left recursive. 
17:27 
Idiosyncrat 
Where's the "worst-case polynomial" result from -- it's not in Parr&Fisher. 
17:28 
Idiosyncrat 
Perhaps it's not for general top-down with backtracking, but for PEG? 
17:33 
ceridwen 
http://www.aclweb.org/anthology/J91-1004 
17:34 
ceridwen 
This is where I've seen the result cited; I haven't gone over it in detail myself. 
17:34 
ceridwen 
PEGs of course are linear with memoization. 
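(A minimal packrat-style sketch of where that linearity comes from, in Python with a made-up grammar rather than anything from the ANTLR papers: memoizing each parse function on its start position means every (rule, position) pair is computed at most once, so total work is bounded by rules times positions.)

    from functools import lru_cache

    # Toy PEG:  sum <- num ('+' sum)? ;  num <- [0-9]+
    def parses(text):
        @lru_cache(maxsize=None)           # packrat memoization:
        def num(pos):                      # one cache entry per (rule, position)
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            return end if end > pos else None

        @lru_cache(maxsize=None)
        def sum_(pos):
            end = num(pos)
            if end is None:
                return None
            if end < len(text) and text[end] == "+":
                rest = sum_(end + 1)
                if rest is not None:       # ordered (PEG) choice
                    return rest
            return end

        return sum_(0) == len(text)

    print(parses("1+2+34"))   # True
    print(parses("1++2"))     # False

Without the caches, a backtracking parser can in general revisit the same positions exponentially often.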
17:42 
Idiosyncrat 
The Norvig claim is for a special technique, one not used in Parr&Fisher 
17:43 
Idiosyncrat 
And not a general claim for simple top-down parsers w/ memoization. 
17:43 
ceridwen 
I agree -- I should have said, "Can be worst-case polynomial." I'm not sure what ANTLR 3's actual worst case is. 
17:45 
ceridwen 
In the later paper http://www.antlr.org/papers/allstar-techreport.pdf, they prove an O(n^4) bound for ANTLR 4. 
17:45 
Idiosyncrat 
"We should point out that without memoization, back 
17:45 
Idiosyncrat 
tracking parsers are exponentially complex in the worst case" 
17:45 
ceridwen 
They have optional memoization, but they don't clarify how exactly they're memoizing. 
17:46 
Idiosyncrat 
From Parr&Fisher, last graf in section 6 
17:46 
ceridwen 
So it could be polynomial-bounded, but we simply don't know without digging through the code. 
17:47 
ceridwen 
"Backtracking in 
17:47 
ceridwen 
versions of ANTLR prior to 3.x suffered from exponential 
17:47 
ceridwen 
time complexity without memoization. Ford also solved this 
17:47 
ceridwen 
problem by introducing packrat parsers. ANTLR 3.x users 
17:47 
ceridwen 
can turn on memoization with an option." 
17:49 
ceridwen 
That's from the second-to-last paragraph in section 7, same paper. 
17:49 
Idiosyncrat 
Right, ANTLR 3 was linear if you switched to PEG mode, exponential worst-case otherwise. 
17:50 
ceridwen 
Okay, yes, I think we agree. (I found some of these papers a bit confusing too.) 
17:50 
Idiosyncrat 
Not exactly the clearest way of stating it, but I think that's what they are saying. 
17:50 
Idiosyncrat 
Thanks for pointing out the ANTLR 4 paper .... 
17:51 
Idiosyncrat 
this does seem to have a non-exponential worst case 
17:52 
ceridwen 
They definitely changed approaches for ANTLR 4. 
17:52 
ceridwen 
Yes. 
17:54 
Idiosyncrat 
If I were convinced that topdown was as good as you could do ... 
17:54 
Idiosyncrat 
I would either study ANTLR very closely, or ... 
17:55 
Idiosyncrat 
just give up, and refer everybody to Parr's work. :) 
18:01 
Idiosyncrat 
Also, the layout of the ANTLR 4 paper is better -- you can spot his claims w/ a quick scan -- in Parr&Fisher it's an exercise in decipherment. 
18:03 
ceridwen 
I'm choosing the former, at the moment :). I don't actually believe top-down is as good as you can do, for some values of "as good." I think there are still questions I haven't seen answered to my satisfaction, such as what the constant factors are for the various algorithms compared to each other. (I mean, we know that Valiant's algorithm with Coppersmith-Winograd has the best asymptotic time for some highly ambiguous inputs, but the constant factors are so bad...) 
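(The connection here: for a grammar in Chomsky normal form, CYK's span-combination step is a boolean matrix product, and that product is what Valiant's algorithm hands off to fast matrix multiplication. Below is a rough numpy sketch with a made-up two-rule grammar; the naive fixed point is still not Valiant's algorithm -- his divide-and-conquer is what brings the whole table down to roughly the cost of one fast matrix multiplication.)

    import numpy as np

    rules = [("S", "A", "B"), ("S", "B", "A")]   # toy CNF rules X -> Y Z
    lexical = {"a": {"A"}, "b": {"B"}}           # toy terminal rules

    def recognizes(words):
        n = len(words)
        # M[X][i, j] is True iff nonterminal X derives words[i:j]
        M = {X: np.zeros((n + 1, n + 1), dtype=bool) for X in ("S", "A", "B")}
        for i, w in enumerate(words):
            for X in lexical.get(w, ()):
                M[X][i, i + 1] = True
        # Naive fixed point: each pass does one boolean matrix product per rule.
        for _ in range(n):
            for X, Y, Z in rules:
                M[X] |= (M[Y].astype(int) @ M[Z].astype(int)) > 0
        return bool(M["S"][0, n])

    print(recognizes(list("ab")))   # True
    print(recognizes(list("aa")))   # False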
18:06 
Idiosyncrat 
Actually, do we really *know* that about Valiant ... 
18:06 
Idiosyncrat 
I mean that is what everybody says, but I don't think there has been much in the way of trying to make Valiant practical. 
18:07 
Idiosyncrat 
Because of course, just quickly writing the algorithm off because the constant factors are hopeless is exactly what was done 
18:07 
ceridwen 
I've wondered about that myself. 
18:07 
Idiosyncrat 
with Earley's. 
18:08 
Idiosyncrat 
And they stuck to that "bad constant factor" line as CPU/$$ improved by a factor of a billion or so. 
18:08 
ceridwen 
At least for Coppersmith-Winograd, I think the arguments still hold (only for matrices larger than current hardware can hold). 
18:09 
Idiosyncrat 
I've never really studied Valiant/Coppersmith/Winograd, but I would not take the textbook assertions about bad constant factors overly seriously. 
18:10 
Idiosyncrat 
I did see a paper about Valiant's algorithm some months ago, where they did try to implement it, and found the situation not to be as bad as feared. 
18:11 
ceridwen 
Oh, which paper? 
18:11 
Idiosyncrat 
I'll try to find it. 
18:11 
ceridwen 
The only discussion I've found is in https://www.reddit.com/r/haskell/comments/3b4ztr/a_neat_way_of_doing_nongreedy_parsing_with_parsec/csjgfnb , and they cite this paper: http://www.cse.chalmers.se/~bernardy/PP.pdf 
18:15 
Idiosyncrat 
http://www.cse.chalmers.se/~bernardy/PP.pdf 
18:15 
Idiosyncrat 
Ah, you beat me to it. 
18:18 
Idiosyncrat 
I had grave doubts about one aspect of the paper, which talked about the complexity bounds on the assumption that there were limits on recursive depth ... 
18:19 
Idiosyncrat 
which I cannot convince myself is a legitimate approach ... 
18:19 
Idiosyncrat 
it seems to me a bit like doing number theory on the assumption there is a maximum integer. 
18:20 
Idiosyncrat 
That aside, I found their stuff about Valiant's intriguing and thought they were very much to be praised for going beyond the quick "bad constant factor" dismissal 
18:24 
ceridwen 
Thanks :). I haven't gotten a chance to read it yet, but I will have to do so. My personal suspicion is that if there's something to be gained from Valiant's algorithm, it's going to be from applying changes in hardware capabilities like parallelization or vectorization. 
18:25 
ceridwen 
I haven't had much time to put into it, but I do wonder. 
18:26 
ceridwen 
Of course, other algorithms like GLL or GLR could also potentially benefit from parallelization. 
18:26 
Idiosyncrat 
In the reddit discussion that you cite, edwardkmett seems to suggest that the parallelization is a distraction. 
18:26 
ceridwen 
Yes. 
18:27 
Idiosyncrat 
And reading the paper, I even wondered if that aspect had been played up to get the paper accepted. 
18:27 
ceridwen 
It's quite possible there's nothing there. 
18:27 
Idiosyncrat 
Parallelization is fashionable, and Valiant's is not. 
18:29 
ceridwen 
I agree that in general the importance of parallelization has been overstated -- I haven't studied this particular problem enough to say one way or the other here. For parsing in general, my take is similar to Grune and Jacobs': parsing is not really a naturally parallel problem, and the many attempts so far haven't come up with anything amazing. 
18:30 
ceridwen 
There's been a truly massive amount of work in creating software and hardware for fast matrix multiplication, though, which does make me wonder. 
18:38 
ceridwen 
It's interesting to me that the old algorithms for parsing LL-regular grammars use two passes, the first reading from right to left. 
18:45 
ceridwen 
That supports your theory that you have to use the right context :). 
20:28 
ernimril 
I think it is very important to be able to parse several sources in parallel, but that is not the same as making the parser parallel. It may require some things (like the grammar) to be thread-safe, but I think that is how it mostly ends up anyway. 
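(A small sketch of that last point, with a compiled regex standing in for a hypothetical compiled grammar object: as long as the shared object is immutable, independent sources can be parsed concurrently without the parser itself being parallel.)

    import re
    from concurrent.futures import ThreadPoolExecutor

    GRAMMAR = re.compile(r"\d+(?:\+\d+)*")   # built once, then shared read-only

    def parse_source(src):
        # Each task touches only the shared immutable GRAMMAR plus local state.
        return src, GRAMMAR.fullmatch(src) is not None

    sources = ["1+2+3", "4+", "10+20"]
    with ThreadPoolExecutor() as pool:
        for src, ok in pool.map(parse_source, sources):
            print(src, "accepted" if ok else "rejected")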
21:43 

ronsavage joined #marpa 