Perl 6 - the future is here, just unevenly distributed

IRC log for #marpa, 2016-09-05


All times shown according to UTC.

Time Nick Message
01:22 PatrickHuber joined #marpa
01:23 PatrickHuber hello
01:26 idiosyncrat_ hi
01:27 PatrickHuber Thanks for participating in the issues for the Pliant library today
01:27 idiosyncrat_ Plaint sort of snuck up on me -- I didn't realize it was that far along
01:27 idiosyncrat_ s/Plaint/Pliant/
01:28 PatrickHuber I've been off and on again with coding for about a year and a half
01:28 idiosyncrat_ I take it that it's pretty much a full Marpa equivalent at this point, in C#
01:28 idiosyncrat_ ?
01:28 PatrickHuber Not quite, you do a lot more precomputation of the grammar
01:29 idiosyncrat_ You mean stuff like precedenced statements?
01:29 PatrickHuber Yes, I have an issue open for that
01:30 idiosyncrat_ Why didn't you choose to wrap Libmarpa?
01:30 PatrickHuber right now I'm not 100% sure how precedence is determined. I noticed you've been blogging about it a lot recently
01:30 PatrickHuber I like to learn how things work by implementing them
01:30 PatrickHuber it was a hobby project
01:30 idiosyncrat_ Gotcha
01:31 PatrickHuber Hobbs was the person who got me hooked in his Floss Weekly episode
01:31 PatrickHuber always been a minor parsing geek, but the earley algorithm really resonated
01:31 idiosyncrat_ Andrew Rodland++
01:31 PatrickHuber yes
01:32 idiosyncrat_ So as it stands it's fully implemented, and more 'sugar' than Libmarpa, but not all the bells and whistles of Marpa::R2
01:32 idiosyncrat_ Is that correct?
01:33 PatrickHuber I'll have to check out Marpa::R2 in more detail to answer that fully. I've covered the base algorithm, the A&H nullable fix, Leo, and Scott's SPPF.
01:33 idiosyncrat_ And your symbols have names and it all works?
01:34 PatrickHuber yes, I don't rewrite the grammar or do mapping if that is what you mean
01:34 PatrickHuber just oop
01:35 * idiosyncrat_ and Patrick have not mentioned it yet, but Pliant is in C#
01:36 idiosyncrat_ For others, Pliant is at https://github.com/patrickhuber/Pliant
01:37 PatrickHuber most of the implementation is straight from the papers or what I could pull from the google group conversations and IRC logs
01:37 PatrickHuber the one unique thing I did was implement Leo parse forests with a virtual item
01:37 PatrickHuber instead of expanding the leo items
01:37 idiosyncrat_ Cool
01:38 idiosyncrat_ I take it you haven't implemented runtime events?
01:39 PatrickHuber no, the ambiguity always bothered me with runtime events
01:39 idiosyncrat_ They're a hassle and the more extra features you include the more hassle runtime events become.
01:39 PatrickHuber I guess if you need a streamable API or can disambiguate on your own it's OK
01:40 idiosyncrat_ But they're very useful, and modern programmers are used to being able to use procedure logic -- they've been trained on deterministic parsers.
01:40 PatrickHuber gotcha. I noticed the nearley library used events and not SPPFs
01:41 idiosyncrat_ Come to think of it having Leo items in the SPPF's would not conflict with runtime events.  Oops.
01:41 idiosyncrat_ Because SPPF's are created after runtime.
01:41 PatrickHuber I imagine not. They are just evaluated when you traverse the forest
01:41 PatrickHuber I created SPPFs during runtime and stitched them together on enumeration
01:42 PatrickHuber so some of the info would be lost
01:42 idiosyncrat_ I don't know much about C#.  Is its use largely in the MS world?
01:46 PatrickHuber yes, very similar to java
01:46 PatrickHuber its a c language
01:46 PatrickHuber c like
01:46 idiosyncrat_ And compiles, so it's as fast as C?
01:46 idiosyncrat_ IIRC that's the case.
01:46 PatrickHuber no, but you get some platform independence
01:46 PatrickHuber its close to c++ in some cases
01:46 PatrickHuber but memory management is managed for you
01:46 PatrickHuber you can go around it, but then you are on your own
01:46 PatrickHuber very similar to the JVM
01:46 PatrickHuber language -> bytecode -> interpreter
01:46 PatrickHuber there is a just in time compile step that gives you the native performance
01:46 PatrickHuber tradeoff is managed memory has global collection
01:46 PatrickHuber so your app can pause
01:46 idiosyncrat_ I hope you understand my preference for interacting on this IRC channel or our Google group -- just as you were able to benefit from the discussions there, so others will be able to benefit from this.
01:46 sirdancealot joined #marpa
01:48 ilbot3 joined #marpa
01:48 Topic for #marpa is now Start here: http://savage.net.au/Marpa.html - Pastebin: http://scsys.co.uk:8002/marpa - Jeffrey's Marpa site: http://jeffreykegler.github.io/Marpa-web-site/ - IRC log: http://irclog.perlgeek.de/marpa/today
01:49 idiosyncrat_ You probably saw that my comment on "knock offs" is now in our FAQ.
01:49 PatrickHuber yes. very similar to the approach I took
01:49 idiosyncrat_ For some reason I didn't pick up that you *had* implemented the Leo algorithm ...
01:49 idiosyncrat_ I guess because most folks doing knock offs try to skip it.
01:50 PatrickHuber I spent a lot of time on it, probably 2-3 months
01:50 PatrickHuber I didn't want to give up on it. i saw the exponential growth in my unit tests
01:50 idiosyncrat_ It's surprisingly counter-intuitive -- in fact, depending on how we count it, yours may be the 2nd practical implementation in 25 years.
01:51 PatrickHuber that would be an honor
01:51 idiosyncrat_ Since it's a hobby, I take it progress on Pliant will have to be slow.
01:52 PatrickHuber For the most part. I've been working about 2-3 hours 5 nights a week or so
01:52 idiosyncrat_ That's a lot if you have a fulltime job.
01:52 PatrickHuber I don't get to program much during work, its a outlet
01:53 PatrickHuber an*
01:53 PatrickHuber I've been rather obsessed with performance the past couple of weeks
01:54 PatrickHuber which is why I ended up looking at A&H DFA transformation
01:54 PatrickHuber I have the precomputation done, and was playing around with the runtime
01:54 idiosyncrat_ At this point I have more experience with implementing the A&H transformation than they do.
01:55 idiosyncrat_ I actually implemented it in the Earley versions of Marpa, and ran counts.
01:55 PatrickHuber I believe you :) Did you only use the prediction precomputation in Marpa or did you throw the whole dfa transformation away?
01:56 idiosyncrat_ I threw the whole thing away.
01:56 idiosyncrat_ Predictions can be optimized in other ways.
01:56 PatrickHuber I'm curious about that. I saw the previous logs about transitive closures, but from what I understand they need to be done on a nxn matrix
01:56 idiosyncrat_ Predictions are very special Earley items in that the dot location is always 0, and the origin is always the current Earley set.
01:57 idiosyncrat_ So predictions could be implemented with a per-Earley-set bit map, one bit per rule.
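In code, the per-Earley-set bit map idea above might be sketched as follows (a minimal Python illustration; the grammar representation, rule indexing, and prediction-closure loop are assumptions for the sketch, not Libmarpa's actual data structures):

```python
# Sketch: an Earley set's predictions as a bit map, one bit per rule.
# A predicted item always has dot position 0 and its origin is the
# current Earley set, so the rule index alone identifies it.

RULES = [
    ("S", ("A", "B")),   # rule 0
    ("A", ("a",)),       # rule 1
    ("B", ("b",)),       # rule 2
]

# Precompute, per symbol, the bit mask of rules it directly predicts.
rules_by_lhs = {}
for i, (lhs, _rhs) in enumerate(RULES):
    rules_by_lhs[lhs] = rules_by_lhs.get(lhs, 0) | (1 << i)

def predict(symbol):
    """Close the prediction set: each predicted rule also predicts
    the rules of the symbol just after its dot (at position 0)."""
    predicted = rules_by_lhs.get(symbol, 0)
    changed = True
    while changed:
        changed = False
        for i, (_lhs, rhs) in enumerate(RULES):
            if predicted & (1 << i) and rhs:
                more = rules_by_lhs.get(rhs[0], 0)
                if more & ~predicted:
                    predicted |= more
                    changed = True
    return predicted

mask = predict("S")
# bits 0 and 1 are set: predicting S predicts rule 0, whose RHS
# starts with A, which in turn predicts rule 1
```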
01:58 idiosyncrat_ "nxn matrix" -- you mean to compute predictions?
01:58 PatrickHuber yes, warshall's algorithm is for k=1->|V| , i=1->|V|, j=1->|V|
01:59 PatrickHuber so if you have a 2x3 bit matrix you'll jump out of bounds in the comparison part
01:59 PatrickHuber I noticed you only use the column dimension when calculating transitive_closure
01:59 idiosyncrat_ Libmarpa currently uses a work list to compute the predictions, not a matrix.
02:00 PatrickHuber ok, perhaps I missed that part
02:00 PatrickHuber I have trouble reading the w text format
02:00 idiosyncrat_ "only use the column dimension" -- that's so I can use per-row bit operations.
02:00 idiosyncrat_ In a sparse matrix, many rows will be all 0's and you can test for them 64 (or 32) at a time.
02:01 idiosyncrat_ But the work list is more flexible, and I don't think the speed difference is measurable.
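The per-row bit operations being discussed can be sketched like so (an illustrative Python version in which each matrix row is an integer used as a bit vector; a C implementation would process 64 or 32 columns per machine word):

```python
# Sketch of Warshall's transitive closure using whole-row bit
# operations. rows[i] is a bit vector: bit j set iff edge i -> j.

def transitive_closure(rows):
    """Mutate rows into the reachability (transitive closure) matrix."""
    n = len(rows)
    for k in range(n):
        row_k = rows[k]
        if row_k == 0:           # skipping all-zero rows is what makes
            continue             # sparse cases much cheaper than n**3
        bit_k = 1 << k
        for i in range(n):
            if rows[i] & bit_k:  # i reaches k, so i reaches all k reaches
                rows[i] |= row_k
    return rows

# 0 -> 1 -> 2, node 3 isolated
m = transitive_closure([0b0010, 0b0100, 0, 0])
# m[0] now has bits 1 and 2 set: node 0 reaches both 1 and 2
```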
02:01 PatrickHuber I noticed dijkstra's algorithm can work as well as warshall's, is that right?
02:02 idiosyncrat_ "Dijsktra's algorithm"?
02:02 PatrickHuber it calculates shortest path in a graph
02:03 idiosyncrat_ https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
02:03 PatrickHuber yes, thats it
02:04 idiosyncrat_ Don't know about Dijkstra's algorithm, but I always need *all* connections, not just the shortest one.
02:04 PatrickHuber I think you end up traversing the graph in both cases
02:05 PatrickHuber but as you mentioned Warshall's may be better with the bit math
02:05 PatrickHuber because of the register sizes
02:05 idiosyncrat_ Perhaps it would work, I frankly do not know.
02:05 PatrickHuber I basically describe it as a queue with a hashtable
02:06 PatrickHuber if you have visited a node, it goes into the hash table and you queue up new nodes
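The "queue with a hashtable" worklist method Patrick describes amounts to breadth-first reachability, roughly as follows (an illustrative sketch; the adjacency-dict graph format is an assumption):

```python
# Sketch of the worklist method: a queue of pending nodes plus a
# hashtable (set) of visited ones, so each node is processed once.

from collections import deque

def reachable(graph, start):
    """Return the set of nodes reachable from start."""
    seen = {start}               # the "hashtable" of visited nodes
    queue = deque([start])       # the worklist
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:  # only unseen nodes are queued
                seen.add(nxt)
                queue.append(nxt)
    return seen

g = {"S": ["A", "B"], "A": ["C"], "B": [], "C": ["A"]}
# reachable(g, "S") -> {"S", "A", "B", "C"}
```

This visits each edge once, which is why it tends to beat a dense matrix method on sparse graphs.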
02:06 idiosyncrat_ I got the suggestion for Warshall's from Grune & Jacobs, and did not look further.
02:07 idiosyncrat_ Though G&J actually state that they think the worklist method is better.
02:07 PatrickHuber great book btw. only read the earley parts but very clear descriptions
02:07 PatrickHuber it works better for sparse graphs
02:07 PatrickHuber a bit vector as you said has a bunch of 0's
02:08 PatrickHuber or infinities depending on how you implement
02:08 PatrickHuber this is where I found the original idea: http://cstheory.stackexchange.com/a/2493/32787
02:09 PatrickHuber when I was trying to determine nullability for symbols in the grammar
02:12 PatrickHuber not 100% sure if it is Dijkstra's algorithm, but this made me think it was: https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm (see comparison with other shortest path algorithms)
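The nullability precomputation mentioned above is a classic fixed-point calculation; a rough sketch (the (lhs, rhs-tuple) grammar format is an assumption for illustration):

```python
# Sketch: compute the nullable symbols of a grammar by iterating to a
# fixed point. A symbol is nullable iff some rule for it has a RHS
# whose symbols are all nullable (an empty RHS trivially qualifies).

def nullable_symbols(rules):
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable

g = [
    ("S", ("A", "B")),
    ("A", ()),          # A -> empty, so A is nullable
    ("B", ("A",)),      # B derives only nullables, so B is nullable
    ("C", ("c",)),
]
# nullable_symbols(g) -> {"A", "B", "S"}
```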
02:12 idiosyncrat_ At this point my biggest optimization and maintenance problems come from my choice of language for the implementation of the various logic levels.
02:12 idiosyncrat_ That is, Marpa::R2 uses C, but then when that's too much uses Perl and that's where 90%++ of my overhead is.
02:13 idiosyncrat_ Which means I don't really sweat the choice of C algorithms too much.
02:14 PatrickHuber I have done very little perl, most was back in 2003-2006 and mostly maintaining code written by another who left the company. Is there overhead going across memory bounds or just in the interpreted nature of perl?
02:14 idiosyncrat_ In theory Warshall's is O(n**2) vs. O(n) but in practice there's no measurable difference.
02:15 idiosyncrat_ G&J say the Warshall's is always O(n**3) but that's one of their rare mistakes.
02:15 PatrickHuber I thought it would be because of the three loops through |V|
02:15 PatrickHuber I noticed you did a low level memory manipulation on the third loop though
02:16 PatrickHuber is that where you get the O(n**2)?
02:16 idiosyncrat_ Or I should say that Warshall's *can be* O(n**2) in favorable cases.
02:16 idiosyncrat_ Warshall's is indeed O(n**3) worst case.
02:16 idiosyncrat_ On the innermost loop IIRC there's a test.
02:17 idiosyncrat_ On optimistic assumptions about its result, you get O(n**2)
02:19 PatrickHuber I'm guessing optimistic is most practical cases
02:19 idiosyncrat_ That's my experience.
02:21 idiosyncrat_ At this point I'm trying to create new features, and only those optimization improvement which really amount to new features, in the sense they move some techniques into new application areas.
02:22 idiosyncrat_ s/only those optimization improvement/I am most interested in only those optimization improvements/
02:22 PatrickHuber that makes sense
02:23 idiosyncrat_ Back in the 1970's folks salivated over the idea of a parser with the capabilities of Marpa/Leo ...
02:24 idiosyncrat_ but today there's a kind of top-down Stockholm syndrome.
02:24 idiosyncrat_ Folks don't see the potential of real 2nd order languages -- languages which write other languages.
02:24 PatrickHuber The only problem I've found so far with the algorithm is one of its biggest strengths. The state it accumulates and the constant factors.
02:25 PatrickHuber that was one of the biggest "ah ha!" moments for me
02:25 PatrickHuber weaving grammars together is super easy
02:26 PatrickHuber the whole domain specific language concept was a big reason I became interested in parsing
02:28 PatrickHuber not a big IRC user, so trying to reference a previous point in time for context: Going back to the previous thread on predictions. When you mentioned the bit vector, that makes sense. Put the rules in the vector and the position is implied. Do you just dump the precomputed states into the earley set and avoid the whole prediction phase?
02:28 idiosyncrat_ I think perhaps one reason folks are putting off adopting the improved Earley parsers is a perception they're not from academia, that Marpa, etc. are the kind of folk parsers which appear with lots of claim and no real theory behind them.
02:29 idiosyncrat_ re predictions: actually I use a work list for predictions -- I haven't found the bit vector optimization worth it so far.
02:29 PatrickHuber ok. makes sense
02:30 idiosyncrat_ Work lists enable some advanced features which involve turning individual predictions off and on -- I could set things up to use a bit vector for the predictions I don't twiddle, but so far it hasn't been worth it.
02:31 idiosyncrat_ Anyway, re Marpa & academia.
02:31 PatrickHuber I come from enterprise development, rarely do my colleagues get into parsing theory. They just want something that works. The Roslyn compiler project has changed some of that
02:31 PatrickHuber That works fast too
02:31 PatrickHuber Basically Roslyn is a compiler as a service
02:31 idiosyncrat_ Actually I have an MSCS from Yale, and my professors included Perlis & Ned Irons, who wrote the 1st parser paper, so I didn't exactly emerge from the hills. :-)
02:33 PatrickHuber I'm not as familiar with academia, so sorry if I came across as crass
02:33 idiosyncrat_ That was not ad hominem, that was just me ruminating.
02:34 idiosyncrat_ :-)
02:34 PatrickHuber :)
02:35 idiosyncrat_ It is a little perplexing, when you consider how fast our field usually moves, that a revolutionary advance (Leo's) from 1991 took over 2 decades to get a practical implementation.
02:36 idiosyncrat_ In chip design, for example, it's my understanding that if you don't follow the field for 18 months, you're waaaay behind.
02:36 PatrickHuber I had a lot of trouble implementing it
02:36 PatrickHuber yes, moore's law
02:37 idiosyncrat_ I have to be AFK for a few minutes
02:37 PatrickHuber sure
02:43 idiosyncrat_ Actually, I should be grateful for the slow rate of progress in parsing -- it's been wonderful to find all this stuff, and in other fields I might well have been too slow a thinker to get the nicer results. :-)
02:45 PatrickHuber you did some amazing work
02:47 PatrickHuber So is Marpa::R2 entirely implemented in perl?
02:48 idiosyncrat_ No, it's built on top of a C library -- Libmarpa
02:48 PatrickHuber ok, I thought you had a pure perl implementation as well
02:48 idiosyncrat_ Early versions of Marpa did, but Marpa::R2 never had a pure Perl version.
02:50 idiosyncrat_ Experience showed that few people preferred pure Perl to the speedier hybrid versions, and that having to compile Libmarpa was not the issue that I feared it would be.
02:50 idiosyncrat_ And I was sure happy to stop maintaining two versions. :-)
02:51 PatrickHuber ha, I can imagine
03:00 PatrickHuber had a long weekend, going to sign off. Thanks again for all of the information!
03:11 idiosyncrat_ PatrickHuber: I hope I was helpful.
04:28 shadowpaste "idiosyncrat_" at 217.168.150.38 pasted "FAQ on Precedenced statements and ambiguity" (33 lines) at http://fpaste.scsys.co.uk/533028
04:28 idiosyncrat_ ronsavage: Another entry for the FAQ
04:28 idiosyncrat_ Thanks!
06:46 gleki joined #marpa
07:05 ronsavage joined #marpa
07:25 ronsavage JK: FAQ updated: http://savage.net.au/Perl-modules/html/marpa.faq/faq.html. See Q 141.
08:09 ronsavage joined #marpa
08:20 sirdancealot joined #marpa
08:37 sirdancealot joined #marpa
11:26 sirdancealot joined #marpa
13:45 Idiosyncrat joined #marpa
13:46 Idiosyncrat ronsavage: Thanks!
13:46 Idiosyncrat Some corrections to Q141
13:47 Idiosyncrat In the Q: s/came out ambiguous/come out ambiguous/
13:48 Idiosyncrat First sentence: s/There a section in the/There is a section in the/
13:50 Idiosyncrat s/With this precedenced statement:/Consider this precedenced statement:/
13:50 Idiosyncrat Thanks!
15:45 beaugunderson_ joined #marpa
15:52 VsyachePuz joined #marpa
18:07 gleki hi so no one knows of Earley's parsers with lookahead support?
19:08 sirdancealot joined #marpa
22:38 ronsavage joined #marpa
22:38 ronsavage JK: FAQ Updated.
23:34 ronsavage joined #marpa
23:57 idiosyncrat_ joined #marpa
23:58 idiosyncrat_ ronsavage: Thanks!
