
IRC log for #marpa, 2015-11-16


All times shown according to UTC.

Time Nick Message
00:30 Idiosyncrat joined #marpa
00:31 Idiosyncrat Btw, a theory as to why Joop Leo's result got ignored for 15 years.
00:33 Idiosyncrat One of my former profs, Dick Lipton of Georgia Tech, has a term: "galactic algorithms"
00:34 Idiosyncrat By this he means the kind of algorithm that appears in the math literature to prove "there exists an algorithm such that ..."
00:35 Idiosyncrat But that nobody will ever implement, because it's just too complicated and messy in terms of the payoff.
00:36 Idiosyncrat Working up the Leo chapter of my new paper, I see that, while I do not consider it "galactic" in Lipton's sense, there is a lot to be worked out ...
00:36 Idiosyncrat and I think previous workers may have considered it "galactic", though I've never seen anybody say that in print.
00:37 Idiosyncrat Btw, there is a faster general parsing algorithm, called Valliant's, which *is* said to be galactic, and which has never been implemented AFAIK.
00:39 Idiosyncrat https://en.wikipedia.org/wiki/CYK_algorithm#Valiant.27s_algorithm
00:39 Idiosyncrat s/Valliant/Valiant/
00:40 Idiosyncrat But from discussing things with the many folks now knocking off new versions of Earley's ...
00:41 Idiosyncrat I do get the impression that everybody considers Joop's algorithm, if not galactic,
00:41 Idiosyncrat at least "semi-galactic".
00:48 Cheery hey
00:50 Idiosyncrat yo!
00:53 Cheery I ended up making a tool that doesn't do right-recursion optimization. Omitting those few states made it a lot more difficult to parse, although it was a correctly working recognizer
00:55 Idiosyncrat You have to start somewhere ...
00:55 Idiosyncrat my first versions of Marpa were written before I knew about Leo's algorithm.
00:56 Idiosyncrat It was my frustration at the speed for certain kinds of grammar that made me look harder into the literature and find Joop's paper.
00:58 Cheery overall this whole thing seems incredibly difficult.
00:59 Cheery well. parsing is.
01:00 Cheery I ended up doing the parsing by pushing values into one stack and rules into another.
01:01 Cheery whenever there's an expandable item in the value stack, I push a rule onto the rule stack and expand the item in the value stack.
01:02 Cheery each rule gets a counter, which is decremented by pushes into the value stack when the rule is topmost in the stack.
01:02 Idiosyncrat_ joined #marpa
01:03 Cheery when counter reaches zero, the rule 'reduces' and the reduced construct is pushed into a third stack.
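A rough Python sketch of the stack scheme Cheery describes here (node shapes and names are hypothetical, not his actual code): expandable items push their rule onto a rule stack with a countdown counter, the counter ticks down as values arrive, and when the topmost counter reaches zero that rule reduces and the result goes onto an output stack.

    from collections import namedtuple

    # Hypothetical node/rule shapes, just for illustration.
    Rule = namedtuple("Rule", "lhs rhs action")
    Node = namedtuple("Node", "rule children")   # an "expandable item"

    def evaluate(root):
        values = [root]     # value stack: items still to be processed
        rules = []          # rule stack: [rule, remaining count, arity]
        output = []         # third stack: reduced constructs
        while values:
            item = values.pop()
            if isinstance(item, Node):
                # expandable: push its rule, expand the children onto the value stack
                rules.append([item.rule, len(item.children), len(item.children)])
                values.extend(reversed(item.children))
            else:
                output.append(item)
                if rules:
                    rules[-1][1] -= 1
            # when the topmost counter reaches zero, that rule 'reduces'
            while rules and rules[-1][1] == 0:
                rule, _, arity = rules.pop()
                args = output[len(output) - arity:]
                del output[len(output) - arity:]
                output.append(rule.action(*args))
                if rules:
                    rules[-1][1] -= 1
        return output.pop()

    # e.g. evaluating (1 + 2) + 3 with a single rule sum -> term term
    plus = Rule("sum", ("term", "term"), lambda a, b: a + b)
    print(evaluate(Node(plus, [Node(plus, [1, 2]), 3])))   # prints 6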
01:05 Idiosyncrat Sounds like a variant on top-down parsing.
01:07 Cheery I separated the ambiguity checking into a function, which gets sequence of values. If there's one value in that sequence, it returns. Otherwise it raises an error for ambiguity. :)
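That ambiguity check might look roughly like this (a hypothetical sketch, not Cheery's actual function):

    def disambiguate(values):
        # 'values' is the sequence of alternative results for one node;
        # exactly one alternative is allowed, otherwise the parse is ambiguous.
        values = list(values)
        if len(values) != 1:
            raise ValueError("ambiguous parse: %d alternatives" % len(values))
        return values[0]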
01:15 Cheery It feels like there would be a neater algorithm in plain sight, but we just haven't figured it out yet.
01:17 Cheery looking at the way this is structured.. Earley in itself seems like an optimization
01:18 Cheery or you could look at the recognizing -> parsing step as an optimization.
01:18 Cheery the recognizer doesn't do work on every ambiguous parse, that is then left to the actual parsing step.
01:20 Cheery the LR(0) split-dfa is also meant as an optimization, but you apparently noticed it prevents more useful optimizations
01:23 Idiosyncrat With the LR(0) split-dfa, the Aycock&Horspool paper does not claim any new theoretical results.
01:24 Idiosyncrat And my experience was that it was not an optimization in practice, either.
01:24 Idiosyncrat And it very much got in the way when it came to evaluation, implementing events, etc., etc.
01:29 Cheery In the NNF, the thing I wonder about is how to evaluate rules that can become null, but which do not themselves have a null production rule?
01:30 Cheery the nullable sets tell me which nonterminals are nullable, and I can determine the productions that produce null values.
01:30 Cheery in a way it's simple if there's only one value that can become null.
01:31 Cheery guessing it's just that you got to raise an ambiguity error for such values.
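The nullable sets mentioned above are usually computed with a small fixed-point loop; a sketch, assuming the grammar is given as (lhs, rhs-tuple) pairs (a hypothetical representation):

    def nullable_set(rules):
        # A nonterminal is nullable if some rule rewrites it to a
        # (possibly empty) string of nullable symbols.
        nullable = set()
        changed = True
        while changed:
            changed = False
            for lhs, rhs in rules:
                if lhs not in nullable and all(sym in nullable for sym in rhs):
                    nullable.add(lhs)
                    changed = True
        return nullable

    # e.g. for X -> A B where A and B both have empty productions:
    print(nullable_set([("X", ("A", "B")), ("A", ()), ("B", ())]))
    # prints the set {'A', 'B', 'X'} (order may vary)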
01:33 Cheery I'll try to make the NNF version tomorrow.
01:34 Cheery And look into whether I can get the Leo items implemented.
01:34 Cheery It's been so many times that I've not been listened to on this kind of thing.
01:35 Cheery so I'd better not repeat the error I've seen so often.
01:38 Idiosyncrat I am not sure I understand the question, but if I have it right ...
01:38 Idiosyncrat this is one I can help with.
01:39 Idiosyncrat Marpa prunes every nulled tree back to its root, and the semantics of the tree is the semantics of the root symbol -- all other symbols are ignored.
01:39 Idiosyncrat Initially, I wondered if this simplification would bother users.
01:40 Idiosyncrat But experience has provided the answer -- there was no pushback on this, ever.
01:40 sadmac joined #marpa
01:40 Idiosyncrat Apparently users intuitively feel this is the right way to handle nulling semantics.
01:41 Idiosyncrat I am not 100% sure this answer is actually to the question you were asking, but I hope it helps.
01:50 Cheery you mean that when I have X -> A B, where I have A -> $ and B -> $
01:51 Cheery then you have some rule Y -> X t
01:51 Cheery you just produce null in the place of X, rather than trying to determine reductions for it?
01:53 Cheery getting some sleep.. but I'll read this later.
02:11 Idiosyncrat Cheery: you got it
02:11 Idiosyncrat It's a real Gordian knot hack, but it works for the users, so that ...
02:12 Idiosyncrat simple and hackish as it is, it may turn out to be among the real insights from the Marpa project.
02:12 Idiosyncrat The question is "Can I really do that?"
02:13 Idiosyncrat I certainly wondered when I was implementing it, but like I say, it goes over with the users just fine.
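A minimal sketch of the pruning rule Idiosyncrat describes, assuming a hypothetical (symbol, start, end, children) node shape rather than Marpa's internals: a subtree that spans no input is nulled, and only its root symbol's null semantics is consulted; everything beneath it is never evaluated.

    null_semantics = {"X": lambda: None}      # what a nulled X evaluates to

    def evaluate(node, actions):
        symbol, start, end, children = node
        if start == end:                      # nulled subtree: prune to its root
            return null_semantics[symbol]()   # A, B, etc. below it are ignored
        kids = [evaluate(c, actions) if isinstance(c, tuple) else c
                for c in children]
        return actions[symbol](*kids)

    # Y -> X t, where X -> A B is entirely nulled at position 0:
    actions = {"Y": lambda x, t: ("Y", x, t)}
    tree = ("Y", 0, 1, [("X", 0, 0, [("A", 0, 0, []), ("B", 0, 0, [])]), "t"])
    print(evaluate(tree, actions))            # ('Y', None, 't')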
05:17 ronsavage joined #marpa
05:59 Cheery I feel too pragmatic today to replace the whole parser.
06:00 Cheery need to write a program that forces me to do it first.
06:01 Cheery yay.
06:01 Cheery maximum recursion depth exceeded.
06:01 Cheery now I can fix it
06:04 Cheery At times I feel I have too many competitors so I can't stop & pull aside for a moment to replace even a small system.
06:39 Cheery yay! fixed a stack overflow issue. :)
06:41 Cheery it takes 10 seconds to compile force_me, which forced me to fix my stuff.
06:41 Cheery it's 5000 lines
06:42 Cheery now I could address that issue too.
06:43 Cheery watch "python compile.py sample.cb sample"
06:45 Cheery goal: replace ordereddict with a list and a per-set-lookup
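A sketch of what that goal might look like (hypothetical, not Cheery's actual code): each Earley set kept as a plain list for ordered processing, with a per-set set() used only to reject duplicate items.

    class EarleySet:
        def __init__(self):
            self.items = []       # iterated in insertion order while processing
            self.seen = set()     # per-set duplicate check

        def add(self, item):
            if item not in self.seen:
                self.seen.add(item)
                self.items.append(item)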
06:48 Cheery odd.. why am I doing late prediction?
07:04 Cheery ok.. dropping ordereddict didn't exactly give me the performance increment I expected.
07:05 Cheery it was a modest drop to 8 seconds.
07:07 Cheery traverse takes the longest time.
07:08 Cheery oh.. I probably shaved the stepping time in half anyway.
07:11 Cheery let's see if I can reduce the time spent in building the chains.
07:12 Cheery I'll try replacing it with the method I came up with recently.
07:24 Cheery the old one is a second better than the new one.
07:25 Cheery I need to get to the heart of it.
07:39 Cheery ok.
07:39 Cheery some observations: the more complex I make it, the slower it becomes
07:41 Cheery and it turns out what I expected to make it faster didn't actually make it that much faster.
07:41 Cheery if I didn't have to traverse the thing, the recognizer itself would take just 2 seconds to run.
07:56 Cheery The issue here may be cache locality
07:57 Cheery first I construct huge bundle of data.
07:57 Cheery then the top-down method blurts out a bunch of divergent accesses into that data.
07:58 Cheery but then.. at the leaves the accesses converge
08:05 Cheery damn damn.. I feel there's a way to make it really fast. But I'm not sure how to do it.
08:12 Cheery got a funny idea for how to generate the nihilist normal form grammar.
08:21 Cheery did it by binary ordering. :)
08:23 Cheery okay. there it is. cool. it says that s can be null.. but I guess that rule is redundant. It's the trivial case you mentioned.
08:24 Cheery I omit those.
08:25 Cheery the empty input string is the trivial case.
09:00 Cheery also self-recursive rules resulting from nullifying items are trivial.
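One plausible reading of "binary ordering" (a guess, not necessarily Cheery's actual method): for each rule, run through every keep/null combination of its nullable right-hand-side symbols, emitting one rule variant per combination and dropping the fully-nulled variants as the trivial case mentioned above.

    from itertools import product

    def nnf_rules(rules, nullable):
        variants = set()
        for lhs, rhs in rules:
            # each nullable RHS symbol gets a binary keep/drop choice
            choices = [(True, False) if sym in nullable else (True,)
                       for sym in rhs]
            for keep in product(*choices):
                new_rhs = tuple(s for s, k in zip(rhs, keep) if k)
                if rhs and not new_rhs:
                    continue          # rule nulled itself out entirely: trivial
                variants.add((lhs, new_rhs))
        return variants

    print(sorted(nnf_rules([("X", ("A", "B", "t"))], {"A", "B"})))
    # [('X', ('A', 'B', 't')), ('X', ('A', 't')), ('X', ('B', 't')), ('X', ('t',))]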
09:22 Cheery Now I have an NNF variant of the Earley algorithm. I decomposed the single chart into transitions and completions lists.
09:24 Cheery so I got linear-time lookups by index, and linear-time lookups by index and a terminal.
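One plausible layout for that decomposition (hypothetical, not Cheery's actual code): instead of scanning one flat chart, keep two indexes, one answering "which items in this Earley set expect a given symbol?" and one answering "which completed items for a given symbol end in this set?".

    from collections import defaultdict

    transitions = defaultdict(list)   # (earley_set, expected symbol) -> items
    completions = defaultdict(list)   # (earley_set, lhs) -> completed items

    def index_item(earley_set, item):
        (lhs, rhs), dot, origin = item        # item = (rule, dot position, origin)
        if dot < len(rhs):
            transitions[earley_set, rhs[dot]].append(item)
        else:
            completions[earley_set, lhs].append(item)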
09:30 Cheery looking at this thing.. since it now finds:
09:30 Cheery s: [([s -> a * s], 0)]
09:30 Cheery [s -> a s * ] 0
09:32 Cheery oh and..
09:32 Cheery [s -> a * ] 2
09:34 Cheery during the completion of that rule.. a deterministic path starts to form.
09:34 Cheery it finds the: s: [([s -> a * s], 1)]
09:38 Cheery so there it is.. Once I figure out how to handle the right recursion, it may actually be worthwhile to try out.
09:38 Cheery (both in recognizing and parsing)
15:38 koo7 Cheery, marpa in python? for fun?
18:26 Cheery you could say that. I do it to understand. That's my way to kick ass
21:21 ronsavage joined #marpa
22:55 lucs joined #marpa
22:56 user2_ joined #marpa
23:05 aredridel joined #marpa
