IRC log for #marpa, 2014-09-06

All times shown according to UTC.

Time Nick Message
01:11 ronsavage daxim: When I said '\' was invalid, I meant as a Perl string. But you're right, it's a valid Marpa bnf string.
01:19 ronsavage rns: Your code slightly modified: https://gist.github.com/ronsavage/9fd213e768d274301787. For me, (see gist comment) cases 1, 4, 5 all resolve to the same string and are exactly what I want. Cases 2, 3 fail as expected.
01:19 ronsavage Now I just need to get the event stuff working with the BNF as in that gist.
01:48 ronsavage And here's the event-driven version: https://gist.github.com/ronsavage/ca9bcb0804e0782e9ad1 with output in the comments. $many x $thanx to all contributors.
02:34 jeffreykegler joined #marpa
02:36 jeffreykegler ronsavage: I've updated my personal Marpa site with new top-level language, as "tested" in my recent blog post. You may, at your convenience, want to adapt this for the top-level of your site.
05:23 hippietrail joined #marpa
05:48 hippietrail joined #marpa
07:38 rns joined #marpa
07:42 rns joined #marpa
07:44 rns ronsavage: re http://irclog.perlgeek.de/marpa/2014-09-06#i_9307170 — looks good, glad it worked.
07:44 hippietrail joined #marpa
09:05 hippietrail joined #marpa
11:37 daxim joined #marpa
13:33 jdurand joined #marpa
13:34 jdurand Jeffrey, I was thinking about your dev environment. Aside from VS 2008 proposed by rns, which is a good suggestion (I always felt that 2008 is much, much faster than 2010), there is also Code::Blocks (http://sourceforge.net/projects/codeblocks), which I have never tried, btw. Does anybody have experience with Code::Blocks, in particular on Windows?
14:45 hippietrail joined #marpa
14:47 jeffreykegler joined #marpa
14:59 hippietrail hi jeffreykegler
15:23 jeffreykegler hi
15:23 jeffreykegler hippietrail: your series of Lao questions has been interesting
15:24 hippietrail thanks. it's hard to explain without explaining how lao spelling works (-:
15:25 hippietrail and it's proving hard for me to analogize into ascii stuff, since i'm using the parsing to learn how the syllables work as i go
15:25 jeffreykegler Is your Lao effort open source?
15:25 hippietrail well there's not really anything to show yet
15:25 jeffreykegler Gtocha
15:25 jeffreykegler * Gtocha => Gotcha
15:26 * jeffreykegler is trying to type before he's had his coffee
15:26 hippietrail but i'm trying to see if it's possible to come up with something that lets you step syllable by syllable through lao text, since it's a language that doesn't use spaces between words
15:27 hippietrail i can pastebin the current versions of my code if you want to look at it
15:27 jeffreykegler This may be similar (or not), but I've thought a lot about Sanskrit sandhi parsing -- very different, but a sort of similar problem level.
15:27 jeffreykegler Or start a github repository
15:27 hippietrail well sanskrit is usually written in devanagari and devanagari uses a logical ordering in unicode where thai and lao use a visual ordering
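[Editor's sketch] The ordering difference described above can be seen by listing the characters of a string in storage order. This is a minimal illustration using Python's standard `unicodedata` module; the two sample strings (Devanagari "ki" and Lao "ke") and the helper name `stored_order` are chosen here for illustration and do not come from the log.

```python
import unicodedata

def stored_order(text):
    """Return the Unicode character names of `text` in storage order."""
    return [unicodedata.name(ch) for ch in text]

# Devanagari "ki": the vowel sign is drawn to the left of the consonant,
# but it is stored after the consonant (logical order).
devanagari_ki = "\u0915\u093f"

# Lao "ke": the vowel sign is drawn to the left of the consonant
# and is also stored before it (visual order).
lao_ke = "\u0ec0\u0e81"

print(stored_order(devanagari_ki))
print(stored_order(lao_ke))
```

In the Devanagari string the consonant comes first in memory, while in the Lao string the vowel sign comes first, so a Lao syllable grammar has to accept vowel symbols on either side of the consonant.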
15:28 hippietrail i don't think it's ready for github. i could stick it in a gist
15:29 hippietrail so far i'm testing it on random lo.wikipedia pages (text only, no html) and tweaking the grammar while comparing it to the inadequate info on lao on the internet
15:29 jeffreykegler As you prefer, but I think (and there are many examples) that there's no such thing as "not ready for github" -- if it's got a project name and a one line description, it's ready
15:29 hippietrail it doesn't have a project name other than "yet another toy lao syllable grammar" (-:
15:30 jeffreykegler YATLSG
15:31 hippietrail sanskrit you can probably handle with a lexer. lao works better with a parser since vowel symbols can come before or after consonant symbols visually and in the text, but phonically the vowels always follow the consonants
15:31 hippietrail also there are two letters which function as both consonant and vowel depending on context
15:31 jeffreykegler It does sound interesting.
15:32 hippietrail and vowel sounds may need one, two, three, or even four symbols around a consonant
15:32 hippietrail yeah i've been racking my brain with it for weeks. i have bisonish and pegish grammars that worked up to a point but didn't help find real life ambiguities
15:33 hippietrail i actually have two versions: one which tries to treat the vowels in a generic manner, and one which only accepts legal combinations of vowel symbols
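[Editor's sketch] To show why a dual-role letter makes syllable boundaries ambiguous, here is a toy segmenter that enumerates every way to split a string into syllables. It is not hippietrail's grammar and not real Lao orthography: the symbol classes (`p`, `C`, `v`, `W`), the syllable shape, and the function names are all assumptions made up for this illustration.

```python
from functools import lru_cache

# Hypothetical symbol classes (assumptions, not real Lao letters):
#   "p" = vowel sign written before the consonant
#   "C" = consonant
#   "v" = vowel sign written after the consonant
#   "W" = letter that can act as either a consonant or a vowel
# Assumed syllable shape: [p] onset vowel* [final], with at least one vowel.

def syllable_ends(text, start):
    """Return the set of indices where a syllable starting at `start` may end."""
    ends = set()
    i = start
    has_vowel = False
    if i < len(text) and text[i] == "p":
        has_vowel = True
        i += 1
    if i >= len(text) or text[i] not in "CW":
        return ends  # no onset consonant, so no syllable starts here
    i += 1
    while True:
        if has_vowel:
            ends.add(i)  # open syllable ends here
            if i < len(text) and text[i] in "CW":
                ends.add(i + 1)  # syllable closed by a final consonant
        if i < len(text) and text[i] in "vW":
            has_vowel = True  # consume one more vowel symbol
            i += 1
        else:
            break
    return ends

def segmentations(text):
    """Return every split of `text` into syllables, as a list of tuples."""
    @lru_cache(maxsize=None)
    def from_index(start):
        if start == len(text):
            return [()]
        results = []
        for end in sorted(syllable_ends(text, start)):
            for rest in from_index(end):
                results.append((text[start:end],) + rest)
        return results
    return from_index(0)

print(segmentations("CvCv"))  # one reading
print(segmentations("CvWv"))  # two readings: "W" as onset or as vowel
```

A deterministic parser would commit to one of the readings of `CvWv` without reporting the other, which is roughly the difficulty with finding real-life ambiguities mentioned above.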
15:33 jeffreykegler Is there a maximum length of the input?  For C/C++ you have to be ready to parse arbitrary length efficiently, but for natural languages you can often assume that (at least) average length will be short.
15:35 hippietrail http://pastebin.com/fqj9UcXN
15:36 hippietrail no maximum length. but i'm not parsing full sentences. just syllable structure. i think maximum length of a single syllable would be maybe 7 characters or so
15:38 hippietrail by the way, a JavaScript port of Marpa would be great (-: http://softwarerecs.stackexchange.com/questions/11160
15:38 jeffreykegler Maybe if we could find a funder.
15:38 hippietrail here is one of the docs i'm using as a reference to the lao vowel combinations: http://www.thailao.net/laovowel.htm
15:39 hippietrail the only earley parser i can find for js is extremely rudimentary
15:41 jeffreykegler Most Earley parsers are just copies of Jay Earley's original algorithm from his ~1970 paper, often faithfully including the bug in his paper.
15:41 hippietrail perl is #1 by far for unicode support, but with the fatal flaw that it doesn't output unicode to the windows console. in fact node.js is the only scripting language which does support unicode on the console cross platform
15:41 hippietrail yes in fact the js earley i found is just a recognizer i think and doesn't have a way to access the forest
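[Editor's sketch] For readers unfamiliar with the distinction, this is roughly what a bare Earley recognizer looks like: it answers yes or no and builds no parse forest. It is a minimal textbook-style sketch, not Marpa's algorithm and not the JavaScript library mentioned above. The grammar representation (a dict from nonterminal to a list of right-hand-side tuples) and all names are assumptions for this example. Nullable symbols are handled by advancing the dot during prediction, a known technique due to Aycock and Horspool.

```python
def nullable_symbols(grammar):
    """Return the set of nonterminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable

def recognize(grammar, start, tokens):
    """Return True if `tokens` is derivable from `start`. Builds no forest."""
    nullable = nullable_symbols(grammar)
    n = len(tokens)
    sets = [[] for _ in range(n + 1)]   # Earley sets, in insertion order
    seen = [set() for _ in range(n + 1)]

    def add(k, item):
        if item not in seen[k]:
            seen[k].add(item)
            sets[k].append(item)

    # An item is (lhs, rhs, dot position, origin set index).
    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))

    for k in range(n + 1):
        i = 0
        while i < len(sets[k]):
            lhs, rhs, dot, origin = sets[k][i]
            i += 1
            if dot == len(rhs):
                # Completer: advance items that were waiting for `lhs`.
                for plhs, prhs, pdot, porigin in list(sets[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        add(k, (plhs, prhs, pdot + 1, porigin))
            elif rhs[dot] in grammar:
                # Predictor: expand the nonterminal after the dot.
                sym = rhs[dot]
                for alt in grammar[sym]:
                    add(k, (sym, alt, 0, k))
                if sym in nullable:
                    add(k, (lhs, rhs, dot + 1, origin))
            elif k < n and tokens[k] == rhs[dot]:
                # Scanner: the terminal after the dot matches the input.
                add(k + 1, (lhs, rhs, dot + 1, origin))

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in sets[n]
    )

# Hypothetical ambiguous grammar: S ::= S "+" S | "n"
ambiguous = {"S": [("S", "+", "S"), ("n",)]}
print(recognize(ambiguous, "S", ["n", "+", "n", "+", "n"]))
print(recognize(ambiguous, "S", ["n", "+"]))
```

The input `n + n + n` has two derivations under this grammar, but the recognizer reports only that at least one exists. Recovering the alternatives requires recording links between items, which is the forest access that the log says is missing.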
15:42 hippietrail i also couldn't find a GLR parser for js
15:42 jeffreykegler GLR does not seem to be much used -- supposedly it is an option with bison.
15:43 hippietrail the other reason i moved away from perl finally was that i never really got to grips with accessing members of deep structures with all the pesky sigils and braces
15:43 hippietrail yes but it hasn't been ported to jison
15:44 hippietrail by the way you might want to look at the jison and pegjs webpages - they both have a feature where you can play with a grammar right in the browser in real time without even a login
15:44 jeffreykegler I'm not a Perl promoter -- for me it is the tool I happen to use, mainly because of CPANtesters.
15:44 hippietrail like all languages perl has strengths and weaknesses, promoters and detractors. js certainly does too.
15:45 jeffreykegler ronsavage: re hippietrail's suggestion -- something like that might be nice
15:46 jeffreykegler ... for the web page
15:46 hippietrail it would be extra work for perl since it would need to round trip to a server doing the work where the js ones just do all the work in the user's browser
15:46 jeffreykegler Good point.
15:47 hippietrail so could get slow or expensive if it got popular
15:47 jeffreykegler Another good point.
15:48 hippietrail but somebody might enjoy making an open source version for people to run on their own server
15:48 jeffreykegler hippietrail: best for the time being may be to post specific questions as you hit obstacles --
15:48 jeffreykegler ... at least until the rest of us come up to speed in Lao. :-)
15:49 hippietrail yes it's tricky to de-lao-ize them (-:
15:49 jeffreykegler Rather than de-lao-ize, just shorten them as much as possible.
15:50 hippietrail i found it stated in quite a few places that lao syllables are always clearly delineated. but i doubted this due to their complicated structure and ambiguous letters (-:
15:50 hippietrail one problem with shortening them, especially with the version in the pastebin, is how many syllable productions there are.
15:50 jeffreykegler I noticed with Sanskrit sandhi it is stated as if it's a consistent set of rules, which I came to very much doubt.
15:51 hippietrail yes basically i'm doing this to find out if the statement is true to satisfy my curiosity (-:
15:51 jeffreykegler Where shortening is not practical, try to focus them, by having very specific test cases.
15:51 hippietrail with the bonuses that i love playing with parsers and parser generators and haven't played with them for years now
15:52 jeffreykegler I'm hoping Marpa opens new vistas of fun
15:52 hippietrail and that i will become much more familiar with lao writing
15:52 hippietrail i love how it can handle programming languages and natural languages. there's a lot of interesting stuff between those two to explore.
15:53 jeffreykegler But I don't think it is necessary to de-Lao-ize -- and it may be counter-productive.
15:53 hippietrail ok. i'll try to keep all the lao-ish-ness in my future questions.
15:53 jeffreykegler hippietrail -- such as making the two (programming & natural languages) come much closer together.
15:54 hippietrail i was happily surprised to find a SO js question went down well with the lao stuff left in
15:55 hippietrail well i've always been interested in both. played with parsing and generating and analysing code and text and various foreign languages
15:55 jeffreykegler The problem with de-Lao-izing is that when you abstract from Lao, you may be limiting the kind of answer you get
15:55 hippietrail i have another idea to try using it for morphological analysis
15:55 jeffreykegler ... and possibly excluding the best answers
15:56 hippietrail there's another problem. when you simplify you often get "why don't you just do ..." answers. then you have to reintroduce some of the stuff you simplified out to explain why such shortcuts aren't what you need.
15:56 jeffreykegler I'm surprised nobody tried to do a practical general parsing tool before -- which was my original Marpa motivation.
15:57 hippietrail yes my first open source release was a literal word by word translator, very crappy
15:57 jeffreykegler hippietrail: re http://irclog.perlgeek.de/marpa/2014-09-06#i_9309129 -- yes exactly.  Best to just hit them with the original Lao issue IMHO.
15:58 jeffreykegler If they wind up learning a little Lao, that's not a bad thing. :-)
15:58 hippietrail but i was always frustrated because i knew you absolutely always have to deal with ambiguity at every level when handling natural languages. and that's not easy to roll for yourself and tools are scarce.
15:59 jeffreykegler So I'll be looking forward to a long series of Lao questions.
15:59 hippietrail well we'll see. i'm in the middle of two double shifts at work right now then i'm taking a break for a week or so.
16:00 jeffreykegler Good luck.
16:00 hippietrail thanks. and thanks for marpa!
17:29 jeffreykegler joined #marpa
17:30 jeffreykegler jdurand: re http://irclog.perlgeek.de/marpa/2014-09-06#i_9308504 -- thanks
20:25 jeffreykegler joined #marpa
20:26 jeffreykegler I'm at a coffee shop so can't listen to the talk, but the goings-on within Scala look very interesting. -- https://news.ycombinator.com/item?id=8276565
20:32 Aria Interesting!
20:35 Aria 'ave you got a login to lobste.rs?
20:36 Aria Kind of a hackernews alternate, invite only, and leans more toward the theory and away from 'lookit what I made'
20:36 Aria Also notably civil.
20:39 jeffreykegler Aria: I've never heard of lobste.rs
20:39 jeffreykegler Aria: btw you were working on a JS port of Marpa at one point IIRC
20:39 Aria Yeah. Still intending to come back to it.
20:40 Aria New job kinda took my spare time for a bit, but getting there.
20:40 jeffreykegler I recall we got some timings
20:40 jeffreykegler Did it come in at 10x slower for bit twiddling?
20:41 jeffreykegler Anyway, when you do get back to it, if you want, ask me for suggestions -- you can be my guinea pig for a new Marpa architecture. :=)
20:41 Aria 5x.
20:42 Aria And I'd love to.
20:42 jeffreykegler * :=) ->:-)
20:42 Aria Anyway, if you'd like an invite to lobste.rs, I'd be happy to give you one.
20:42 Aria For all here, really.
20:43 jeffreykegler My own situation is that I try to keep up with the chat forums that become important, but I'm certainly not on the look-out for new ones.
20:44 Aria I'd figure. It's enough of a repository of interesting things and excellent people that I find it of value -- in a "once a month I look at it" sort of way.
20:44 jeffreykegler Re the new Marpa architecture -- the core engine would handle internal form grammars only -- a lot of stuff would be moved to a layer above.
20:45 jeffreykegler "Internal" means all rewrites are already done, so that the engine can rely on a very restricted form on the grammar ...
20:45 jeffreykegler and optimize the heck out of it.
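[Editor's sketch] As an illustration of what "all rewrites are already done" could mean, here is one kind of rewrite an upper layer might perform before handing a grammar to a restricted core engine: turning a counted sequence rule (`item*` or `item+`) into plain BNF rules. This is an assumption-laden example, not Marpa's actual internal form; the function name `rewrite_sequence` and the `(lhs, rhs_tuple)` rule representation are hypothetical.

```python
def rewrite_sequence(lhs, item, minimum):
    """Rewrite `lhs ::= item*` (minimum=0) or `lhs ::= item+` (minimum=1)
    into plain left-recursive BNF rules, as (lhs, rhs_tuple) pairs."""
    if minimum not in (0, 1):
        raise ValueError("minimum must be 0 or 1")
    # Base case: the empty sequence for "*", a single item for "+".
    base = (lhs, ()) if minimum == 0 else (lhs, (item,))
    # Recursive case: a shorter sequence followed by one more item.
    return [base, (lhs, (lhs, item))]

print(rewrite_sequence("syllables", "syllable", 1))
print(rewrite_sequence("args", "arg", 0))
```

After a pass like this the engine only ever sees ordinary rules, so it does not need any special handling for sequences.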
20:46 Aria Nice. That makes a lot of sense.
20:46 jeffreykegler The current architecture lets a lot of complexity thru to the parse engine, and also has two translation layers above it
20:46 * Aria nods
20:46 jeffreykegler ... it's one of those "evolved" architectures.
20:47 jeffreykegler So I'd outline for you a much cleaner architecture.
20:47 jeffreykegler And one that allows you to develop more incrementally.
20:51 Aria Yeah. I've been looking for clean bits of the layers as I read.
21:01 ronsavage joined #marpa
21:25 idiosyncrat joined #marpa
22:12 ronsavage jeffreykegler: I've updated my Marpa page, although I re-wrote the last sentence of the 2nd last para. Also, I added a new category: Other Parsers. But I don't want this category to dominate, so ... we'll see.
22:14 idiosyncrat ronsavage: there are other parsers? :-)
23:41 ronsavage idiosyncrat: Hahahahaha.
23:57 jeffreykegler joined #marpa
