
IRC log for #marpa, 2014-11-19


All times shown according to UTC.

Time Nick Message
01:57 ronsavage joined #marpa
02:31 jeffreykegler joined #marpa
02:56 jeffreykegler rns: (but for others as well) I notice you're working with the new LUIF.
02:57 jeffreykegler A question that needs to be resolved is what to do about parens -- () -- in BNF statements.
02:58 jeffreykegler Right now, they "hide" the symbols inside them from the semantics.  I did this because it's a good symbolism for it, and because I did not expect to use grouping in the SLIF -- sort of a self-fulfilling expectation.
02:59 jeffreykegler Question is, what to do about parens in the LUIF.  We'll still want to hide symbols, but people have wanted grouping, things like
02:59 jeffreykegler A ::= B (C D)* E
02:59 jeffreykegler and we should let them do that too.
03:00 jeffreykegler But what notation to use for grouping, and what notation to use to hide symbols?
03:06 ronsavage jeffreykegler: How about (! ...) discards and () without ! preserves?
03:08 ronsavage This allows (!!) to discard !, with or without whitespace. So (! !) also discards.
03:09 jeffreykegler That's one thought.  I also thought of -(...) and +(...), with -(...) meaning hidden.  That way you can make the default -- +(...) -- explicit, something which is sometimes handy.
03:09 ronsavage Errr, as long as nobody wants to do (!)* to preserve 0 or more !.
03:11 ronsavage I guess (!) needs to produce a warning that it preserves as a special case.
03:11 ronsavage That is, it only discards if there is more than 1 char inside the (! ...).
03:11 jeffreykegler This is getting kinda complex. :-)
03:12 ronsavage Agreed. Exceptions tend to do that. See English for a literally endless list of examples. (I before i except after c except when it isn't!)
03:13 ronsavage Ahhh. (I before e except after c except when it isn't!)
03:13 jeffreykegler An advantage of + and - is that you could also with it for single symbols without the parens.
03:13 jeffreykegler A ::= B -C +D -E F
03:13 jeffreykegler Where the C and E are hidden
03:13 ronsavage You could also (what's missing?) with it for ....
03:14 jeffreykegler and the B D and F are not hidden, and that's explicitly indicated for the D.
03:14 ronsavage So dropping the () is an option?
03:14 jeffreykegler If nothing is being grouped, yes.
03:15 jeffreykegler + and - would unhide and hide (respectively) whatever immediately follows,
03:15 jeffreykegler be it a symbol or a group of symbols in parens
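The prefix idea above can be sketched in Python. This is only an illustration, not Marpa or LUIF code: the names `parse_rhs` and `visible_values` are hypothetical, and the sketch covers single symbols only, not parenthesized groups.

```python
def parse_rhs(rhs):
    """Split a right-hand side into (symbol, hidden) pairs.

    A leading '-' marks a symbol as hidden, a leading '+' marks it as
    explicitly visible, and an unmarked symbol is visible by default.
    """
    pairs = []
    for token in rhs.split():
        if token.startswith("-"):
            pairs.append((token[1:], True))
        elif token.startswith("+"):
            pairs.append((token[1:], False))
        else:
            pairs.append((token, False))
    return pairs


def visible_values(rhs, child_values):
    """Return only the child values that the semantics would receive."""
    pairs = parse_rhs(rhs)
    return [value for (_, hidden), value in zip(pairs, child_values) if not hidden]


# RHS of "A ::= B -C +D -E F": C and E are hidden, B, D and F are not.
print(visible_values("B -C +D -E F", ["b", "c", "d", "e", "f"]))
```

Under these assumptions the semantics would see the values for B, D and F only.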
03:15 ronsavage I'm trying to say your text "you could also with it" is missing something between 'also' and 'with'!
03:16 jeffreykegler * you could also with it -> you could also use it
03:16 ronsavage Thanx.
03:16 ronsavage I do find the +/- hard-to-read.
03:16 ronsavage With () it's a bit easier.
03:16 jeffreykegler Bangs (!) do stand out, I'll admit.
03:17 ronsavage I'm already feeling the mental effort imagining what your combo "A ::= B -C +D -E F" actually means.
03:17 jeffreykegler Hmmm.  let's see ...
03:18 jeffreykegler A ::= B !C D !E F
03:18 jeffreykegler Better?
03:18 ronsavage It's a classic case (warning: which I hate) of not being able to understand code without mental computation.
03:18 jeffreykegler Note also there
03:18 jeffreykegler 's no way of explicitly saying "not hidden"
03:19 ronsavage Unfortunately I read that (!C) as C must be absent.
03:19 jeffreykegler "absent"?
03:19 ronsavage Must be absent in the input stream in order for the rule to fire.
03:19 jeffreykegler Oh wait ...
03:20 jeffreykegler I see -- like as if it were a regex.
03:20 ronsavage Correct
03:21 jeffreykegler A problem I have with coming up with notation is that I think in BNF first in a world where everybody else thinks in terms of regexes, and BNF second or not at all.
03:21 jeffreykegler Anyway, yes, the notation should be as friendly as possible to somebody coming to it with a regex mindset ...
03:21 ronsavage And yes, there's no way to explicitly say "not hidden". But then everything is not hidden by default, so I'm not sure extra terms to indicate that will help much.
03:21 jeffreykegler because that is almost everybody.
03:22 ronsavage And very few uses will come to this /without/ some regexp experience.
03:22 ronsavage uses => users.
03:23 ronsavage I liked the original () because they do remind me of capturing regexps, despite the inner text being discarded.
03:25 ronsavage Originally, I almost suggested more regexp-like stuff (?!...) to discard, meaning (...) would then capture ie group. But I chopped the ? as redundant.
03:25 jeffreykegler When I proposed parens for hiding, everybody hated the idea.  Usually I don't go ahead with that much negative feedback, but in that case I did, and it turned out OK.
03:25 ronsavage Or how about the regexp-like [! ...] to discard.
03:26 jeffreykegler But now that they will actually be used for grouping, things have to change.
03:26 ronsavage Are there any pre-existing uses of [ and ] within a rule?
03:26 ronsavage Rule here being x ::= y, not x ~ y.
03:26 jeffreykegler No, but we might want to reserve them.
03:27 ronsavage Ah.
03:27 jeffreykegler "them" meaning square brackets -- []
03:27 ronsavage Understood.
03:27 jeffreykegler Parens are spoken for one way or the other, and {} is reserved for co-ordination with Lua, which uses it to define tables.
03:28 jeffreykegler So if we use [] for "hidden", we've fired off our last bullet.
03:29 jeffreykegler By the way other choice is doubling parens to mean hidden, that is, ((...)) hides,
03:29 jeffreykegler while (...) does not.
03:34 ronsavage Doubling is possible. So (\( ... \)) preserves even the (), right?
03:35 jeffreykegler Where did the slashes come from? :-)
03:35 ronsavage Agreed. Keep the [] up your sleeve.
03:35 ronsavage The \ escapes the hidden meaning of the inner ().
03:36 jeffreykegler If you're thinking of what's in the parens as literal text, as in a regex, that's not the case.  It's like the SLIF, where it's symbols and ...
03:36 jeffreykegler if there is literal text, it is single-quoted.
03:37 ronsavage Ahh. So ('('...')') preserves the inner ()? In that case, ((...)) is not so bad after all.....
03:38 jeffreykegler Someday somebody may do a regex-like interface to libmarpa, but I've no plans to be involved.
03:38 ronsavage Me neither.
03:39 jeffreykegler A regex-ish Marpa interface might be a big hit for somebody, though.
03:40 jeffreykegler At this point the SLIF supports every feature you'd want, except backtracking ...
03:40 ronsavage Still, instead of doubling, (! ...) would discard and ('!' ...) would preserve the !. Ie the ! is literal and discards, but a literal '!' preserves as does any other char, e.g. '%'.
03:40 jeffreykegler folks who want backtracking will have to find some other way of going exponential. :-)
03:41 ronsavage Perhaps they'll just have to go ballistic instead :-)
03:41 jeffreykegler ronsavage: Right.  If it's not in single quotes, it ain't a literal.
03:42 ronsavage Re (! ...). Having to defend my idea is making it grow on me.......
03:42 jeffreykegler There's also one other way of expressing literals -- character classes, like [\d] ...
03:42 jeffreykegler which is actually why we can't use square brackets.
03:43 jeffreykegler rns is doing the implementation, so I will want to hear what he thinks.
03:43 ronsavage Right. So ([!] ...) preserves the '! ...' and (! ...) discards the '...'.
03:43 ronsavage Sure - I was waiting for others to chime in.
03:45 jeffreykegler ronsavage: This time zone slot is one where usually just you and I are around.
03:45 ronsavage Or did you mean to use (bang ...) with bang ~ '!' to preserve the '! ...'? I.e. a literal ! means discard and a lexeme of '!' would be mandatory to preserve it?
03:46 jeffreykegler Huh?
03:46 ronsavage Yes - I normally have myself only much of the time. It's 14:48 pm here now.
03:47 ronsavage Try again: Are you saying.... No - forget it.
03:47 jeffreykegler I think the answer is that if they think they can read a bang inside single quotes, they are free to say ('!' A B C), ...
03:47 jeffreykegler and if they prefer to express it as (bang A B C) with bang ~ '!', they can also do that.
03:47 ronsavage Right. And that preserves?
03:47 ronsavage And that also preserves?
03:48 jeffreykegler Literals are literals are literals
03:48 jeffreykegler and if it's not in single quotes or square brackets, it's either syntax or an error
03:48 ronsavage That's what I was trying to say, actually. A non-string ! as in (! ...) discards, is still looking good.
03:49 ronsavage Yes - syntax (ok) or syntax error
03:49 jeffreykegler So for example
03:49 jeffreykegler (!$#%^#^$ A B C)
03:50 jeffreykegler in the above that first thing is either syntax or (much more probably) a syntax error.
03:50 jeffreykegler You'd have to put those special characters into character classes or single quotes to get them literally.
03:51 jeffreykegler This leaves us free to invent *a lot* of syntax, but hopefully we'll be restrained.
03:52 ronsavage Yes - using non-string chars for syntax leaves us very free. Having to enstring (is there such a word?) literals to escape the syntax-based meaning does not seem onerout to me.
03:53 ronsavage 'onerout' => 'onerous'.
03:53 jeffreykegler 'enstring' sounds good -- I'll tell everybody it's Australian
03:53 ronsavage Done. A newlogism, but not the first.
03:54 jeffreykegler I originally thought 'recce' was Australian, until I discovered that nobody in Australia had a clue what I was trying to say.
03:55 ronsavage Ahhhh. 'newlogism' is (a) 'neologism'.
03:55 jeffreykegler Adam Kennedy has never heard of it either.
03:55 ronsavage Ha. That's just your accent. <- is a joke because we hear so many Americans on TV.
03:56 jeffreykegler But I think 'recce' was short for 'reconnaissance' somewhere in the British Commonwealth.  Or maybe I imagined it.
03:57 ronsavage I've heard of it, often. It means reconnoitre. Although I won't argue about the exact spelling. I would almost write 'reccie' or 'reccy'. Definitely not 'recce'.
03:57 jeffreykegler Anyway, I hope to get input from others about the syntax for grouping and hidden symbols.
03:57 jeffreykegler and I should take a break now.
03:58 lucs When I started looking at the Marpa docs, I thought 'recce' was some kind of Italian word -- couldn't wrap my mind around what was going on, until the "recognizer!" lightbulb went on :)
03:58 ronsavage Such abbreviations are legion here. 30 years ago there was an email joke consisting entirely of them, as a fake test for foreigners.
03:58 ronsavage AFK here too.
03:58 lucs Later folks.
06:44 ronsavage joined #marpa
07:28 rns joined #marpa
07:32 jluis joined #marpa
08:13 jeffreykegler joined #marpa
08:14 jeffreykegler Back to the question of LUIF syntax, one of my rules, when in doubt, is to steal from Larry Wall.
08:14 jeffreykegler Larry is very very good at combining expressiveness & meaning in this area, using the ASCII character set as his medium.
08:15 jeffreykegler In Perl 6, for non-capturing groups, he uses square brackets --
08:15 jeffreykegler A ::= B [ C D ] E
08:16 jeffreykegler is how it would look in LUIF
08:16 jeffreykegler For character classes, Larry uses square brackets inside angle brackets --
08:17 jeffreykegler A ::= B [ C <[1-9]> D ] E
08:17 jeffreykegler I'm inclined to go with this.
08:18 jeffreykegler I will confess to being a little disappointed at the persistence with which Larry & the Perl 6 effort stick with what is IMHO a dead-end approach to parsing, but ...
08:18 jeffreykegler in this they're no different from much of the programming community, ...
08:20 jeffreykegler and for the purposes of stealing their syntax, this is actually an advantage.  Their conservatism suggests that this syntax will be understandable, even to folks stuck in the traditional approach.
08:21 rns jeffreykegler: re http://irclog.perlgeek.de/marpa/2014-11-19#i_9683797 — following the ()modifier pattern — (  )*, (  )+, (  )? — minus sign can be used as a modifier to denote hiding, e.g. (  )-
08:22 jeffreykegler rns: What about following the Perl 6 conventions, as just above?
08:24 jeffreykegler I'm quite willing to second-guess Larry when it comes to parsing, but
08:24 rns Well, this overrides the meaning of [] and introduces new syntax for character classes; I'm afraid that can work against the principle of least surprise.
08:24 jeffreykegler when it comes to syntax, that's another thing.
08:25 jeffreykegler If we go postfix, we wind up with stuff like
08:25 jeffreykegler A-+
08:26 jeffreykegler or
08:26 jeffreykegler A ::= B (C D E)*- F
08:28 rns Perhaps not if we allow only a single-character modifier?
08:28 jeffreykegler Then there is no way to express both repetition and hiding for the same group.
08:29 rns Well, how about nesting, e.g. A ::= B ( (C D E)* )- F?
08:29 jeffreykegler Er, yeah
08:30 ronsavage It's hard, isn't it?
08:31 jeffreykegler My thinking at the moment is that it's a simple matter of stealing from Larry who's good at this, and has only gotten better over the years.
08:33 jeffreykegler rns: because we already have that separator syntax which we're borrowing from Perl 6
08:34 rns Which is cool, I must say — the separator syntax uses a new symbol to denote a new thing and that's why it works so great in my eyes.
08:35 ronsavage What about A ::= B (C D E)* F for capturing and B !C D E!* for non-capturing? That way, the * can be anything and you have not used up [].
08:36 rns Unlike here, where [...] all of a sudden means hiding and <[...]> means a charclass.
08:36 jeffreykegler With the separator we have stuff like
08:36 ronsavage You don't have to stick to brackets. Or, if not ! then «...» for hidden?
08:36 jeffreykegler ident* %% ','
08:37 jeffreykegler Lua does not do Unicode
08:37 ronsavage Ahhh.
08:37 jeffreykegler Mixing the postfix - and the separators would be a mess, I think,
08:37 rns jeffreykegler: re http://irclog.perlgeek.de/marpa/2014-11-19#i_9685292 — exactly and that's why I like % and %% expressiveness.
08:37 jeffreykegler but
08:38 jeffreykegler [ ident* %% ',' ] looks OK and so does
08:38 jeffreykegler [ ident ]* %% ','
08:38 jeffreykegler and
08:38 jeffreykegler ident* %% [',']
08:41 rns Well, yes it does. However, I can't help feeling that it also does so with parens.
08:41 jeffreykegler Exactly, parens for non-hiding, square brackets for hiding.
08:42 rns And <[...]> for charclasses?
08:42 jeffreykegler Yes
08:43 rns thinking ...
08:43 jeffreykegler There's no hurry ...
08:44 jeffreykegler While obviously it's got to look OK when we eyeball it, I also think that Larry tends to see implications several steps ahead ...
08:44 jeffreykegler his purposes are a bit different from ours ...
08:45 jeffreykegler and I could be accused of hero-worship here, but hey.
08:45 jeffreykegler In any case these are 1 AM thoughts, California time, so I'll go back to sleep.
08:46 rns Good night, I'll be in touch.
08:46 jeffreykegler AFK
08:50 ronsavage My favourite editor (UltraEdit) is dead. The old version displays a msg that an upgrade is available, but the pop-up can't be dismissed, so that version is useless. And the new version I downloaded auto-aborts, but at least it generates a report as it dies. I'm very angry. I've sent them 3 emails already, and they're all asleep in the USA right now, so it
08:51 ronsavage 's vim or gasp Emacs.
08:52 rns ronsavage: is it possible to just roll back to a previous version?
08:55 rns re: hiding syntax — well, on a second thought, I have to agree that Larry knows better. I becase convinced by comparing
08:56 rns A ::= B ( (C D E)* )- F
08:56 rns and
08:56 rns A ::= B [C D E]* F
08:56 rns s/becase/became/. The latter is obviously cleaner.
09:11 ronsavage No, as I said, the previous version locks up telling you about the upgrade.
09:14 rns Oh, I see. Perhaps toggling-off automatic updates can help?
09:20 ronsavage No - since you can't use the mouse or kbd :-(
09:21 ronsavage A ::= B [C D E]* F is easier to read than the previous version.
09:21 ronsavage rns: What did you think of A ::= B (!C D E) F
09:23 rns Perhaps A ::= B !(C D E) F — in A ::= B (!C D E) F, ! looks applicable to C only.
09:24 ronsavage OK.
09:24 rns Still, I think now that special delimiters for hiding work better than modifiers, whether postfix or prefix.
09:25 ronsavage Try: A ::= B (!:C D E) F as in Perl 5 regexps?
09:26 ronsavage Since Jeffrey doesn't like to use up [], tell me what use Perl 6 makes of [] (in case we want to adopt what they use [] for).
09:28 ronsavage Is the problem that we are trying to attach too much meaning to the brackets (no matter what type of brackets we use)?
09:28 rns Jeffrey's current thinking — http://irclog.perlgeek.de/marpa/2014-11-19#i_9685328 — seems to be () for non-hiding, [] for hiding and <[ ]> do charclasses, the two latter stolen from Perl6.
09:30 rns re http://irclog.perlgeek.de/marpa/2014-11-19#i_9685791 — the alternative is to stack up post- or prefix modifiers or nest ()'s that doesn't feel right to me now.
09:31 rns All in all,
09:31 rns A ::= B [C D E]* F
09:32 rns meaning 'hide (C D E)*' looke likeable to me.
09:33 rns s/looke/looks/.
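The bracket convention summarized above, with () for visible grouping, [] for hidden grouping and <[...]> for character classes, can be sketched as a small Python scanner. This is an assumption-laden illustration rather than LUIF code: `visible_symbols` and the `TOKEN` pattern are hypothetical, quantifiers are simply skipped, and anything the pattern does not recognize (such as `%%`) is ignored.

```python
import re

# Tokens: a character class <[...]>, a single-quoted literal, a bracket,
# a quantifier, or a symbol name. Unrecognized characters are skipped.
TOKEN = re.compile(r"<\[[^\]]*\]>|'[^']*'|[()\[\]*+?]|\w+")


def visible_symbols(rhs):
    """List the RHS items that stay visible to the semantics."""
    closers = {"(": ")", "[": "]"}
    stack = []          # expected closing bracket for each open group
    hidden_depth = 0    # number of enclosing [] groups
    visible = []
    for token in TOKEN.findall(rhs):
        if token in closers:
            stack.append(closers[token])
            if token == "[":
                hidden_depth += 1
        elif token in (")", "]"):
            if not stack or stack.pop() != token:
                raise ValueError("unbalanced " + token)
            if token == "]":
                hidden_depth -= 1
        elif token in ("*", "+", "?"):
            continue    # quantifiers do not affect visibility here
        elif hidden_depth == 0:
            visible.append(token)
    if stack:
        raise ValueError("unclosed group")
    return visible


print(visible_symbols("B [C D E]* F"))
print(visible_symbols("B (C <[1-9]> D) E"))
```

In this sketch everything inside a [] group is dropped from the list, however deeply it is nested, while () groups leave their contents visible.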
09:58 ronsavage Instead of A ::= B [C D E]* F to hide '[C D E]*' how about forcing the user to name the token(s) to be hidden? Then use normal ::= and ~ rules to define them. But provide a new special token like :ignore modelled on :lexeme.
10:01 rns Well, SLIF has :discard for lexemes, which in principle can be extended to G1 rules. Another possibility is a rule-hiding adverb, e.g. lhs ::= rhs hide => 1.
10:13 ronsavage Yes. I almost wrote :lexeme = XXX ignore => 1
10:13 ronsavage So something like yours or mine might work.
10:21 koo5 joined #marpa
10:50 ronsavage Bed time
10:52 rns Good night.
11:22 jluis joined #marpa
11:35 flaviu joined #marpa
12:15 Spads joined #marpa
13:20 koo5 joined #marpa
13:33 daxim by default, nearley just returns nested arrays from the parser.  I now have a piece of code that adds the equivalent functionality of Marpa::R2's feature where results get blessed into classes.
13:46 lwa joined #marpa
14:23 jluis joined #marpa
19:24 Spads joined #marpa
20:38 flaviu joined #marpa
21:29 jeffreykegler joined #marpa
21:31 jeffreykegler ronsavage: re http://irclog.perlgeek.de/marpa/2014-11-19#i_9686228 -- sometimes you want to hide a symbol in one place and not another.  Also, you very often want to hide literals, which don't have names, so doing it by name could only be a supplement.
21:33 jeffreykegler Also, discarding is a bit different from hiding.  The G1 parser never sees discarded stuff, they're like invisible glue between the symbols in rules.  Hidden symbols are visible to the G1 grammar, and are used in the parse, they are hidden to the semantics.
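The distinction between discarding and hiding can be shown with a toy Python pipeline. It is a sketch under stated assumptions, not how Marpa is implemented: `lex` and `apply_rule` are hypothetical names, tokens are single characters, and the "parser" only checks one fixed-length rule.

```python
def lex(text, discard=(" ",)):
    """Toy lexer: discarded characters never reach the parser."""
    return [ch for ch in text if ch not in discard]


def apply_rule(tokens, rhs):
    """Toy single-rule 'parser' for a fixed-length RHS.

    rhs is a list of (literal, hidden) pairs. Every literal, hidden or
    not, must match the input; only the non-hidden ones are handed to
    the semantics.
    """
    if len(tokens) != len(rhs):
        raise ValueError("parse failed")
    seen_by_semantics = []
    for token, (literal, hidden) in zip(tokens, rhs):
        if token != literal:
            raise ValueError("parse failed")
        if not hidden:
            seen_by_semantics.append(token)
    return seen_by_semantics


# The parentheses are hidden, the spaces are discarded.
rhs = [("(", True), ("x", False), (")", True)]
print(apply_rule(lex("( x )"), rhs))
```

Here the spaces are gone before the rule is checked, while the hidden parentheses must still be present in the input: `apply_rule(lex("x"), rhs)` raises `ValueError`.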
21:34 jeffreykegler rns: Why don't we go ahead with the Perl 6 approach then.  [] for hidden grouping, () for un-hidden grouping, and <[...]> for character classes.
21:35 jeffreykegler jluis: looking forward to listening to your talk.  When do you think it will be online?
21:37 ronsavage rns: I do realize putting hide => 1 on ::= rules makes more sense than on ~ rules.
21:44 ronsavage jeffreykegler: I take as an implication of your comment http://irclog.perlgeek.de/marpa/2014-11-19#i_9689879 that if someone wants to discard a symbol in some contexts and just hide it (or even preserve) in others, they cannot do that by giving that symbol 2 names with different options. So - if the symbol occurs in a context covered by 1 rule (using the symbol's name with the 'hidden' option) then Marpa can't hide it, and if the symbol occur
21:47 jeffreykegler No, I didn't mean to imply that.  Renaming would work and adding a rule would be a workaround for literals.  But using a hide adverb is a more heavyweight approach than the SLIF's parens or the Perl 6-ish one.
21:48 jeffreykegler Maybe the more heavyweight approach is something folks will want, but I can't imagine they won't also want the brackets and it's just something I think of as a priority at the moment ....
21:50 jeffreykegler folks will want a symbol-hiding syntax, one of them will be the lightweight one, and that's all we need at the moment ...
21:50 jeffreykegler bearing in mind I'll have to actually implement all this before Kollos is ready for the light of day.
21:51 jeffreykegler (The grammar rewriting and evaluation code are two pieces of the project I don't think I'll be able to delegate.)
22:24 spookah joined #marpa
22:24 spookah left #marpa
23:48 ronsavage joined #marpa
23:49 ronsavage No problem. I think it's best to throw ideas around at this stage, and provoke people to think about the drawbacks as well as the advantages.
