Perl 6 - the future is here, just unevenly distributed

IRC log for #perl6, 2007-02-15

All times shown according to UTC.

Time Nick Message
00:07 IllvilJa joined perl6
00:42 [M]erk What does the darker green mean vs. lighter green in the smokes summaries?
00:42 shay joined perl6
00:46 weinig is now known as weinig|away
00:52 sunnavy joined perl6
01:03 TimToady_ joined perl6
01:13 svnbot6 r15266 | lwall++ | infix and postfix now more predictive
01:28 TimToady_ is now known as TimToady
01:29 gaal joined perl6
01:29 TimToady [M]erk: darker green is actually a Todo
01:30 TimToady ought to be purple or some such
01:30 jisom_ joined perl6
01:39 bonesss joined perl6
01:44 weinig|away is now known as weinig
01:49 cmarcelo joined perl6
01:55 audreyt Grrrr: congratulations on DBIx::Perlish. It's simply brilliant. :-)
01:57 cmarcelo Trac is on feather? if so, where are its config files?
01:57 cmarcelo (moose)
01:58 audreyt cmarcelo: /data/svn/trac/
01:58 wolverian it's also very scary. :)
01:58 * cmarcelo feels that pugs has too many sites =|
01:59 cmarcelo i'll try to connect them to each other at least...
01:59 audreyt *nod*
01:59 audreyt you're now TRAC_ADMIN
01:59 audreyt feel free to hand out admin bits as needed (from the Admin/permissions page)
01:59 cmarcelo audreyt: (feeling better?)
02:00 cmarcelo k
02:00 audreyt no, not really, can't focus for more than 15min at a time
02:00 audreyt highly annoying
02:00 cmarcelo and the "saving throw"-thing ?
02:00 audreyt it looks like I passed
02:01 audreyt a small chance I can leave hospital tomorrow
02:01 audreyt otherwise I'll stay for another week or so
02:01 audreyt but either way this will pass
02:01 audreyt and it looks like no complication will follow
02:01 cmarcelo @tell putter Do you have a blog bit? What do you think about posting on project status, etc?
02:01 lambdabot Consider it noted.
02:01 audreyt I guess I should be grateful :)
02:02 audreyt but it's still highly annoying :)
02:02 cmarcelo audreyt: great news :) [except for annoyances now/next week]
02:04 justatheory joined perl6
02:04 audreyt yeah :) *faints some more*
02:05 araujo hospital?
02:06 cmarcelo the family of sites is: pugscode.org, feather, dev.pugscode.org, rakudo-wiki, perl.org/p6... (shout if I missed something)
02:07 cmarcelo blog.p.o and irc.p.o too
02:09 audreyt spec.pugscode.org
02:09 audreyt irc.pugscode.org
02:09 audreyt run.pugscode.org
02:10 audreyt invite.pugscode.org
02:10 audreyt smoke.pugscode.org
02:10 audreyt that's it I guess
02:12 cmarcelo rakudo is "Perl 6" oriented and dev wiki is "Pugs" oriented?
02:12 audreyt or rather, rakudo is user-facing
02:12 audreyt while dev I hope is dev-facing
02:12 audreyt i.e. it more closely integrates svn/tickets/irc
02:13 audreyt while rakudo is more about presenting userland info
02:13 cmarcelo perl.org/perl6 is parrot-oriented?
02:14 audreyt it is history-oriented :)
02:14 cmarcelo k
02:19 dmq joined perl6
02:21 cmarcelo Pugs development takes place on the perl6-compiler mailing list.
02:21 cmarcelo :o)
02:21 cmarcelo (quotation from pugscode.org)
02:22 TimToady we just fixed that on the wiki
02:24 cmarcelo TimToady: by fixed you mean? (which wiki? rakudo?)
02:27 TimToady http://dev.pugscode.org/wiki/AboutPugs
02:27 lambdabot Title: AboutPugs - Pugs - Trac
02:28 cmarcelo tks. I'll borrow the fix then..
02:28 TimToady mostly just delete that paragraph...
02:29 cmarcelo hmm, but Pugs::Doc::Hack points to old (but beautiful) CPAN version...
02:31 cmarcelo I think it's better to make "How to get involved" point to dev.pugscode.org
02:55 [M]erk Which is the pugs mailing list then? It doesn't look like perl6-compiler is all that active. And perl6-internals seems to be mostly parrot, right?
02:57 cmarcelo [M]erk: pugs discussions are mainly at this IRC channel. perl6-compiler is seldom used...
03:00 svnbot6 r15267 | cmarcelo++ | * feather index: add links to other sister sites, remove kwiki links.
03:00 svnbot6 r15267 | cmarcelo++ | * pugscode.org: add more links, cleanup a bit.
03:00 svnbot6 r15267 | cmarcelo++ | * Pugs/Doc/Hack.pod: fix some links.
03:00 [M]erk PUGS - the project that email is too asynchronous for! PUGS - the project that is living in the now! PUGS - the project that is...
03:01 cmarcelo [M]erk: but we have a Trac system now => dev.pugscode.org ;)
03:02 cmarcelo (http://dev.pugscode.org/changeset/15267 => peer-review welcome)
03:03 lambdabot Title: Changeset 15267 - Pugs - Trac
03:03 * [M]erk gasps.
03:03 [M]erk You know that's a...
03:03 * [M]erk whispers, "Python project" :p
03:04 cmarcelo s/Trac/development wiki/
03:04 cmarcelo =P
03:06 [M]erk Can someone give me svn commit privileges?
03:06 SamB [M]erk: shh
03:06 SamB they'll hear you!
03:06 SamB and give you a commit bit!
03:07 lumi They?
03:09 lumi [M]erk: Got an email address? You can /msg me
03:12 svnbot6 r15268 | cmarcelo++ | * pugscode.org: fix a linebreak.
03:14 lumi [M]erk: Commit bit sent, welcome to Pugs!
03:14 [M]erk Thanks.
03:14 lumi [M]erk: You can test commit by adding your name to AUTHORS
03:15 lumi It's a tradition, or an old charter, or something :)
03:23 SamB there ought to be a PURPORTED-AUTHORS file for that ;-P
04:45 ilogger2 joined perl6
05:08 shay joined perl6
06:02 autark_ joined perl6
06:09 devogon joined perl6
06:13 BooK joined perl6
06:24 amnesiac joined perl6
06:51 sunnavy joined perl6
06:51 jamhed joined perl6
07:22 svnbot6 r15269 | lwall++ | Constraints on which operators metaoperators can metaoperate on.
07:26 iblechbot joined perl6
07:28 marmic joined perl6
07:45 kanru joined perl6
07:54 VanilleBert joined perl6
07:58 Grrrr audreyt: thanks  :-)
08:19 keigo joined perl6
08:49 kanru joined perl6
08:56 baest joined perl6
08:57 baest_ joined perl6
08:58 nekokak joined perl6
09:13 VanilleBert left perl6
09:21 UWC joined perl6
09:25 andara joined perl6
09:36 lumi_ joined perl6
09:40 iblechbot joined perl6
09:57 dduncan So if Grrrr is indeed Anton Berezin ... I like the idea of what I saw in DBIx::Perlish ... and am hoping that you may be able to do something similar for my non-SQL database, currently called QDRDBMS, after it is released.
09:58 dduncan For simplicity, my own DBMS takes certain Perl objects, holding an AST, as input, but it might be nice to have an even more Perlish interface as an optional extension, such as like you did with DBIx::Perlish.
09:59 dduncan Talk more later ...
09:59 pfarmer joined perl6
10:06 elmex joined perl6
10:31 Grrrr dduncan: I don't see a problem;  parsing optree is relatively easy, thanks to B.pm;  the difficult part is to come up with a sensible subset of Perl5 syntax to support
10:38 dduncan fortunately, the language of my DBMS is a lot easier to map Perl to than SQL is, given partly that its design is more Perlish
10:38 dduncan or should I say, more like a normal programming language
10:42 lichtkind joined perl6
10:47 ruoso joined perl6
11:06 zgh_ joined perl6
11:47 Aankhen`` Wow.  Just built Pugs again after a year (I think!), and `nmake fast` really is *fast*.
11:48 Aankhen`` I suppose it could also be the completely different system. :-P
12:05 koye joined perl6
12:06 ruoso @seen fglock
12:06 lambdabot I saw fglock leaving #perl6 5d 17h 54m 48s ago, and .
12:06 ruoso hmm... that's unusual...
12:07 chris2 joined perl6
12:08 ofer1 joined perl6
12:18 gaal joined perl6
12:23 TimToady joined perl6
12:25 rfordinal joined perl6
12:34 lichtkind_ joined perl6
13:06 baest_ ruoso: not sure, but I think he has a vacation
13:14 buetow joined perl6
13:21 Limbic_Region joined perl6
13:32 baest_ is now known as baest
13:42 andara joined perl6
13:59 rfordinal_ joined perl6
14:00 bonesss joined perl6
14:14 pmurias joined perl6
14:14 thepler joined perl6
14:14 pmurias hi
14:15 moritz hi ;)
14:16 pmurias in the mmd algorithm, is the type narrowness dependent on the parameter under consideration
14:16 pmurias ?
14:17 pmurias moritz: is saying "hi" before asking a question silly?
14:18 moritz pmurias: no, it's not ;)
14:24 * pmurias is thinking how to efficiently implement mmd
14:24 bonesss is now known as bones`eat
14:29 diakopter joined perl6
14:36 rfordinal_ is now known as rfordinal
14:40 pmurias do other languages with MMD allow you to specify the importance of parameters with semi-colons? (not necessarily with the same syntax)
14:40 pmurias ?
14:42 rindolf joined perl6
14:54 amnesiac joined perl6
14:58 vel joined perl6
15:00 Coke_ I don't think parrot works that way, fwiw.
15:12 iblechbot joined perl6
15:20 [particle] i know some languages dispatch based on first one or two params only, but none that use a qualifier to denote dispatch semantics
15:20 [particle] however, my knowledge of languages with polymetric polymorphism (aka mmd) semantics is incomplete
15:22 pmurias thanks
15:23 sunnavy joined perl6
15:24 bones`eat is now known as bonesss
15:30 GeJ joined perl6
15:37 ofer1 joined perl6
15:45 andara joined perl6
17:11 rfordinal_ joined perl6
17:17 cjeris joined perl6
17:23 buetow joined perl6
17:46 andara left perl6
17:48 dmq joined perl6
17:48 svnbot6 r15270 | lwall++ | Constraint checks on parameter zones
17:48 svnbot6 r15270 | lwall++ | Constraint checks on reducable infixes
17:48 svnbot6 r15270 | lwall++ | Random cleanup
17:49 dmq just thought id mention that abigail has successfully converted the BNF for email addresses to a perl5.10 recursive regex.
17:50 pasteling "dmq" at 84.58.61.90 pasted "rfc compliant regex parser as a perl 5.10 recursive regex" (69 lines, 2.6K) at http://sial.org/pbot/23004
17:51 broquaint Brilliant.
17:52 broquaint That's some nice work you've done on the regex engine there, dmq.
17:52 dmq he said on #p5p he will look into writing a BNF to regex converter.
17:53 dmq glad you like it broquaint.
17:53 dmq btw, long time no chat. hope you are well
17:54 broquaint I'm well, yourself?
17:54 dmq not too bad.
17:56 broquaint All is well :)
17:58 xinming joined perl6
18:01 [particle] great example, dmq
18:01 [particle] i'd like to see a p6regex -> p5regex converter, and attribute handlers as the syntax marker
18:02 dmq yeah, it is cool. Abigail has been doing some testing of the new features.
18:02 [particle] he's certainly qualified :)
18:02 dmq heh
18:02 dmq did i say testing?
18:02 dmq torturing.
18:02 [particle] abigail++
18:03 dmq anyway, hopefully ive put most of what is required to write such a converter in place.
18:03 dmq assuming you ignore code stuff in p6.
18:04 [particle] fglock will be most happy, i imagine
18:04 dmq heh.
18:04 dmq until he finds the stuff that p5 does that p6 doesnt ;-)
18:05 dmq (?|...) comes to mind.
18:06 dmq although maybe p6 does already have something like that. (it makes capture buffers in different alternations share the same indexes)
18:06 [particle] i don't see that construct in perlre. ew to 5.10?
18:06 [particle] *new
18:07 dmq yes. a couple of weeks old i guess.
18:07 [particle] ah
18:07 dmq (?|..(foo)..|..(bar)..) both capture into the same buffer.
18:08 dmq H. Merijn Brand's idea.
18:08 [particle] $<buffer>:=[(foo)|(bar)] i imagine
18:08 araujo joined perl6
18:08 dmq heh. figured p6 would cover it somehow.
18:10 justatheory joined perl6
18:12 gilimanjaro joined perl6
18:17 apostols joined perl6
18:25 gilimanjaro joined perl6
18:30 Coke_ f
18:30 Coke_ (oops)
18:36 TimToady dmq: (?|...) is the standard behavior for alternations under P6
18:37 TimToady S05: The index of a given subpattern can always be statically determined, but
18:37 TimToady is not necessarily unique nor always monotonic. The numbering of subpatterns
18:37 TimToady restarts in each lexical scope (either a regex, a subpattern, or the
18:37 TimToady branch of an alternation).
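The shared numbering discussed above can be sketched outside Perl. Python's `re` module has no `(?|...)`, so this is only an emulation under that assumption: each branch keeps its own group number, and a helper (hypothetical, named `branch_capture` here) returns whichever branch matched, which is what a shared index would give directly.

```python
import re

# Without branch reset, every alternative gets its own group number:
# (foo) is group 1, (bar) is group 2, (baz) is group 3.
# With Perl 5.10's (?|x(foo)|y(bar))(baz), foo and bar would both be $1
# and baz would be $2.
PLAIN = re.compile(r"(?:x(foo)|y(bar))(baz)")

def branch_capture(match):
    """Return the capture from whichever branch matched (hypothetical helper)."""
    # Groups 1 and 2 belong to different branches, so at most one is set.
    first = match.group(1)
    return first if first is not None else match.group(2)

m1 = PLAIN.match("xfoobaz")
m2 = PLAIN.match("ybarbaz")
```

Calling `branch_capture(m1)` and `branch_capture(m2)` hides the per-branch numbering from the caller, which is the bookkeeping that shared indexes remove.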
18:40 wolverian argh, the lack of covariance in java is making me insane. can someone please just destroy this horrid crap.
18:40 wolverian thanks, now I feel better.
18:44 Patterner Wait for Java 8
18:51 DebolazX joined perl6
18:53 wolverian I particularly like how it does support return type covariance, but not parameter covariance, making the other practically useless (for my purposes, anyway)
18:54 wolverian the return type covariance needs to be explicit, too. sigh.
18:54 apostols left perl6
18:59 drupek12167 joined perl6
19:03 UWC joined perl6
19:05 dmq timtoady: oh goodie, then fglock WILL be happy about (?|...)
19:06 dmq what happens when there is a different number in one of the alternations?
19:06 dmq (?|..(foo)...(foo)...|..(bar)..)(baz)
19:06 dmq what number will the baz buffer have?
19:06 dmq i made it be $3
19:06 Coke_ java 8 or Java 1.8? =-)
19:07 * Coke_ suggests being all hip and having Perl 6 actually be perl 5.24
19:07 dmq btw, sorry for the lag timtoady.
19:15 cddar joined perl6
19:15 Caelum joined perl6
19:16 wilx` joined perl6
19:37 bernhard joined perl6
19:41 larsen_ joined perl6
19:48 wilx` is now known as wilx
19:54 rindolf joined perl6
19:54 rindolf Hi all.
19:56 moritz re rindolf ;)
19:57 rindolf Hi moritz
19:57 rindolf moritz: what's up?
19:59 moritz rindolf: I'm fine, just had a bunch of pancakes as supper ;))
19:59 * moritz feels fat ;)
19:59 moritz rindolf: but regarding perl6/pugs: not much :(
20:00 rindolf moritz: do you feel fat or do you feel full?
20:01 moritz rindolf: rather full than fat ;)
20:02 rindolf moritz: OK.
20:02 UWC joined perl6
20:04 moritz and I'm trying to do some web apps with catalyst...
20:04 moritz it's very confusing, too many different files
20:07 specbot6 r13587 | larry++ | Split statement_modifier category in two.
20:07 specbot6 r13587 | larry++ | List comprehensions can now be done with statement modifiers.
20:07 specbot6 r13587 | larry++ | Multiple dispatch now explained in terms of topological sort.
20:07 specbot6 r13587 | larry++ | Multiple dispatch with single semicolons clarified, maybe.  However, multis
20:07 specbot6 r13587 | larry++ |   with single semicolon are likely just a reserved syntax in 6.0.0.
20:07 TimToady dmq: no, it's actually $1 in P6, short for $/[1]
20:07 TimToady the others would be $/[0][0]
20:07 TimToady $0[0] for short
20:08 TimToady maybe $/[0;0] works too
20:09 stevan_ joined perl6
20:10 dmq ah right
20:10 rindolf TimToady: are you larry in the previous commit?
20:10 rindolf TimToady: or are you lwall?
20:11 dmq is that an xor or an or or?
20:11 TimToady I'm larry on perl.org and lwall on pugscode.org.
20:11 TimToady nobody's ever accused me of being consistently consistent
20:13 dmq er, actually, "ah right" was the wrong thing to say. better would have been "oh really. umm ok, i probably havent read something i should have"
20:13 dmq :-)
20:18 TimToady I hear S05 comes highly recommended.  :)
20:22 masak mm, @evens = ($_ * 2 if .odd for 0..100);
20:22 masak nifty
20:22 masak also quite readable
20:25 TimToady and just falls out of existing syntax, basically
20:25 TimToady course if you want to have multiple lists, you have to get fancier
20:26 rindolf masak: I get a syntax error in r15257
20:27 rindolf masak: this expression seems Pythonic.
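For comparison, a rough Python counterpart of masak's statement-modifier expression, assuming `.odd` means "the number is odd" and `0..100` includes both endpoints:

```python
# Double every odd number from 0 through 100, mirroring
# ($_ * 2 if .odd for 0..100).
evens = [n * 2 for n in range(0, 101) if n % 2 == 1]
```

The filter and the loop appear in the opposite order from the Perl 6 form, but the result is the same kind of flat list.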
20:27 dmq ok, ill read it up.
20:27 dmq i thought i already did. but obviously theres a lot of material.
20:28 dmq btw, do you have an suggestions for how to do char class set operations in perl5?
20:28 rindolf TimToady: did you invent TAP (that "^ok" "^not ok" syntax)?
20:28 dmq ive thought of introducing an "extended char class notation" as (?[....]....)
20:28 dmq like (?[a-z]-aeiou)
20:29 dmq but its kinda fugly.
20:30 Coke_ "The basis for the TAP format was created by Larry Wall in the original test script for Perl 1" - from http://search.cpan.org/~petdance/TAP-1.00/TAP.pm#AUTHORS
20:30 lambdabot Title: TAP - The Test Anything Protocal - search.cpan.org
20:30 rindolf Coke_: thanks.
20:30 rashakil_ joined perl6
20:35 TimToady dmq: if you want to do it more p6-like, you'd say something like (?+[a-z]-[aeiou])
20:36 TimToady which also allows for Unicode properties as names
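The set subtraction being discussed, `[a-z]` minus `[aeiou]`, can be approximated in engines that lack class set operations. A minimal sketch in Python, whose `re` module has no such syntax, using a negative lookahead as a stand-in (this is an emulation, not the notation proposed above):

```python
import re

# [a-z] minus [aeiou]: the negative lookahead rejects vowels first,
# then [a-z] consumes one lowercase ASCII letter.
CONSONANT = re.compile(r"(?![aeiou])[a-z]")

found = CONSONANT.findall("perl six")
```

The lookahead form works but obscures the intent, which is part of the case for a dedicated set notation.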
20:36 mugwump http://git.catalyst.net.nz/gitweb2?p=perl.git;a=blob;h=11c48e;hb=840163;f=t/TEST  # first Harness :)
20:36 lambdabot Title: git.catalyst.net.nz Git - perl.git/blob - t/TEST, http://tinyurl.com/25bk2x
20:36 rindolf TimToady: wanna see some Lisp code I wrote in vim?
20:36 [particle] joined perl6
20:37 nipra joined perl6
20:38 TimToady I am not so in love with either Lisp or vim that I would be crushed to miss it.  :)
20:38 TimToady what does it do?
20:38 dmq (?+...) sounds interesting.
20:39 dmq not sure if its been grabbed for something tho.
20:39 rindolf TimToady: calculates the Graham function.
20:39 rindolf TimToady: it's a port of my code to the advanced Perl Quiz-of-the-Week No. 8, IIRC.
20:39 rindolf TimToady: don't you use vi?
20:40 dmq dang, (?+...) is taken.
20:40 dmq relative recursion.
20:40 TimToady should have read S05 first...  :)
20:40 dmq that sucks.
20:40 dmq i did.
20:40 dmq theres a LOT of stuff in there.
20:40 rindolf http://opensvn.csie.org/shlomif/programs/lisp/trunk/graham-function/ just in case.
20:41 lambdabot Title: Revision 975: /programs/lisp/trunk/graham-function, http://tinyurl.com/2sl8uf
20:41 dmq and in the regex engine itself. keeping both in mind at the same time is kinda hard. :-)
20:41 TimToady yes, but you should realize that <...> is the exact analog of (?...)
20:41 dmq yes, i like that.
20:41 shay hello folks
20:42 rindolf Hi shay
20:42 TimToady howdy
20:42 shay hi shlomif
20:42 shay larry
20:42 dmq hrm, maybe we should change relative recursion so its (?&+1) and (?&-1)
20:42 rindolf shay: what's up?
20:42 [particle] too late to change (?+...) to (?@...) or something else?
20:42 shay rindolf, one week left :)
20:42 rindolf TimToady: I also have a Perl 6 version.
20:42 rindolf shay: to what?
20:42 shay to get released
20:42 shay chafshash
20:42 dmq its +/- for a specific reason.
20:42 rindolf shay: from the Army?
20:42 shay yeah
20:42 rindolf shay: is it the end of your service?
20:43 shay yes
20:43 rindolf shay: or just a vacation?
20:43 rindolf shay: nice.
20:43 rindolf shay: congratulations.
20:43 shay end of the never-ending service
20:43 shay rindolf, thanks
20:43 shay I'll finally have time to work on the sparc port
20:43 rindolf shay: SPARC port to what?
20:44 rindolf shay: SPARC port of what?
20:44 shay perl6
20:44 shay pugs/parrot
20:44 rindolf shay: you mean it doesn't run on SPARC atm?
20:45 rindolf shay: doesn't ghc run on SPARC?
20:45 shay rindolf, it *should* run, but I want to officially maintain it
20:45 rindolf shay: ah OK.
20:45 shay rindolf, test every release, make some test near-daily
20:45 rindolf shay: do you have a SPARC at home?
20:46 shay rindolf, yeah, a Sun Ultra10
20:46 * moritz is jealous ;)
20:46 TimToady dmq: I would suggest that character class sets are probably more important and want a shorter Huffman coding
20:46 shay rindolf, yba gave it to me :)
20:46 rindolf shay: Yonathan Ben-Avraham?
20:46 shay hello moritz :)
20:46 shay rindolf, yes
20:46 rindolf shay: OK.
20:46 rindolf shay: nice.
20:47 shay he sent a mail to linux-il asking if someone wants it, I replied
20:47 shay then he asked me why should he give it to *me*
20:47 [particle] shay: we'll be happy to add you as a parrot porter for sparc
20:47 shay I told him that I want to make some code portability tests and learn the architecture in general
20:47 shay next mail was: "when are you able to pick it up?"
20:47 dmq timtoady: i think i agree.
20:47 dmq no, i agree.
20:48 shay [particle], give me a week
20:48 shay [particle], I need to get home, I'm in the army now
20:48 [particle] yes, i see that
20:48 shay [particle], is someone working on that port atm?
20:48 [particle] no
20:48 shay great
20:48 shay I'm doing my work on NetBSD
20:48 [particle] be great to have a smoker setup
20:49 shay will have
20:49 [particle] shay++
20:49 [particle] join us on #parrot (irc.perl.org) when you're ready
20:49 dmq its just a pain to do.
20:49 czth__ joined perl6
20:49 dmq as im sure you recall from regcomp.c :-)
20:49 rindolf shay: what do you do at the IDF?
20:50 dmq . o O ( If he told you he would have to kill you )
20:50 shay rindolf, fields intelligence combatant
20:50 rindolf shay: I see.
20:50 dmq hah!
20:50 shay dmq, kind of :)
20:57 pbuetow joined perl6
20:59 lichtkind_ forgive me for repeating but what is the current main topic of change?
21:01 TimToady do you mean, what are we working on the hardest right now?
21:02 TimToady I'm mostly working on http://svn.pugscode.org/pugs/src/perl6/Perl-6.0.0-STD.pm
21:04 TimToady other folks are of course working on other things
21:05 TimToady or are you referring more to design and spec changes?
21:06 TimToady or to the topic of the channel?
21:06 rindolf TimToady: aren't you missing a =cut there?
21:06 TimToady =cut is gone in Perl 6
21:06 rindolf TimToady: oh.
21:07 TimToady (though I suspect pugs still recognizes it)
21:07 rindolf TimToady: what text editor are you using?
21:07 TimToady but =begin/=end are always supposed to nest right and return you to whatever context was on the outside.
21:08 TimToady I use vim, but I think some of the syntax that has developed over the years is extremely crufty.
21:08 rindolf TimToady: what syntax?
21:08 TimToady regex, for one
21:09 TimToady but then, I think that about Perl 5 too... :)
21:09 TimToady it would be fun to rewrite vim with Perl 6 as its fundamental syntax.
21:10 rindolf p6re
21:10 TimToady currently everything in vim is very ad hoc
21:10 moritz sadly, yes
21:11 moritz if I imagine the combined power of perl and vim...
21:11 moritz (and I don't mean the linked perl interpreter in vim, that's not _so_ powerful)
21:11 moritz we could have world dominance ;)
21:12 TimToady syntax hilighting with real grammars, for instance
21:13 lichtkind_ TimToady sorry girlfriend asked something, yes i mean spec changes
21:13 lichtkind_ i think thats enough for the first
21:14 TimToady most of the spec changes are from trying to write the grammar
21:14 TimToady but the mmd algorithm has also been on my mind for six months or so
21:14 lichtkind_ TimToady as you may know im writing an editor in perl :)
21:15 lichtkind_ but perl6 is at very bottom on todo :)
21:15 lichtkind_ but its definitely a dream
21:15 tene Would be nice to have an editor with Perl6 integrated like lisp is in emacs
21:16 lichtkind_ that was one of the reasons i started the project
21:16 lichtkind_ but it was 2002, never heard of perl6
21:16 lichtkind_ is now known as lichtkind
21:17 dduncan joined perl6
21:17 lichtkind of course is perl6 much cooler
21:17 lichtkind but i use it as my primary editor
21:17 dduncan left perl6
21:18 lichtkind and always try be rock stable
21:18 moritz lichtkind: does it have vi-like modes and key bindings?
21:18 lichtkind not yet
21:19 lichtkind i personally think vi modes suck badly but for the advantages they bring i wanted to introduce something similar
21:19 lichtkind but currently we have main topic CPANification
21:19 arcady joined perl6
21:20 dduncan joined perl6
21:20 lichtkind to have short commands for editing is very cool
21:20 lichtkind but i like it more distinct visually than different modes
21:21 lichtkind i guess if your interested we can discuss that in another channel
21:21 moritz yeah, it's not the best idea to start an editor flame war ;)
21:22 lichtkind nop because i studied editors a lot and can see good things in all but in the end i plan to make a better one than all together :)
21:23 lichtkind similar to larrys standpoint on languages
21:23 lichtkind :)
21:23 moritz lichtkind: when you're finished I promise I'll try it ;)
21:24 lichtkind you know your never finished :)
21:24 dmq everybody wants to make a better editor.
21:24 dmq :-)
21:24 * moritz not ;)
21:24 moritz lichtkind: s/finished/published Version 3.0/ ;)
21:25 lichtkind dmq no my involvement was an accident :)
21:26 lichtkind moritz thats bad because currently i have 0.3.3.17 and 1.0 contains most i can think of today :)
21:26 moritz lichtkind: that's no problem, you'll get more ideas ;)
21:27 moritz lichtkind: no, honestly, I'll try it before, "when I have time[tm]" ;)
21:27 lichtkind yes but the greatest problem is that it cant be a one man show forever
21:27 lichtkind is this another word for never :)
21:28 Juerd lichtkind: Hi
21:28 moritz lichtkind: ok, tell me an URL ;)
21:28 lichtkind proton-ce.sf.net
21:28 lichtkind but i highly recommend a nightly build
21:28 lichtkind from http://web52.xeon225.server4you.de/
21:29 moritz I highly recommend to offer debian packages ;)
21:29 lichtkind hello Juerd glad to see you
21:29 Juerd lichtkind: Is it acceptable for you to leave halfway day 3?
21:29 Juerd Because otherwise I won't be home in time
21:29 Aankhen`` joined perl6
21:30 lichtkind moritz like i said cpanification is on the way, there is a script to make debian packages of it
21:30 lichtkind moritz i suspect you have linux
21:30 lichtkind therefore a little bit work :)
21:31 lichtkind Juerd ok  i would like to stay a bit so i see you as my backup plan :), but i say in time when i found something else
21:33 Juerd lichtkind: Okay
21:33 Juerd You can always drive along on the way there, of course.
21:33 Juerd That'll be the 20th
21:33 Psyche^ joined perl6
21:36 lichtkind Juerd thanks thats great but from ffm to munich i already have a seat in the car of a friend, after munich he goes to his parents in austria, that's why i need another way back :)
21:37 dmq lichtkind you are in ffm?
21:38 dmq am i going to be driving to munich with you i wonder?
21:38 dmq :-)
21:38 lichtkind dmq no god beware, but a girl friend of mine :)
21:39 lichtkind ah and when you drive back?
21:39 dmq ah ffm aint soooo bad.
21:39 dmq im not driving back.
21:39 dmq train.
21:39 Juerd lichtkind: I see
21:39 lichtkind dmq and which time?
21:39 dmq back?
21:39 Juerd dmq: Destination?
21:39 dmq not sure.
21:39 dmq ah, GPW?
21:39 dmq oder, DPW
21:40 Juerd dmq: And the other way?
21:40 dmq oh, back to ffm
21:40 dmq i live here
21:40 Juerd DPW is confusing: could be Dutch or Deutsche ;)
21:40 dmq there
21:40 Juerd What's ffm? :P
21:40 moritz Juerd: Frankfurt/Main
21:40 lichtkind frankfurt
21:40 Juerd I see
21:41 dmq but the weird thing is im driving to munich with someone who afterwards is going to austria to see his parents.
21:41 Juerd dmq: I'm almost driving through Frankfurt. If you need a ride...
21:41 Juerd The thing is that I'm leaving during day 3. Probably missing the last 3 or 4 talks.
21:41 lichtkind dmq i come from the east and from a village, that's why its hard for me to live there more than a week
21:41 dmq ill be going with corion from pm. so i wouldnt worry about me, its very kind to offer tho.
21:42 lichtkind Juerd the second last is interesting to me
21:42 Juerd lichtkind: I'm quite interested too, but need to make it back home in time
21:42 lichtkind of course
21:42 Juerd It's 10 hours from Munich to Dordrecht
21:43 Juerd And I need to be awake the day after
21:43 lichtkind :)
21:43 Juerd So I'm planning on arriving approx 1:00, saturday, then sleeping 7 hours, and waking up at 8 am.
21:43 Juerd My office will be rewired that day.
21:44 Juerd Getting 3 * 35 A
21:44 Juerd Which is a nice upgrade from 2 * 16 A, but requires some additional changes.
21:44 Juerd (To use it well.)
21:44 lichtkind dmq when you go back at saturday we can join a weekend ticket
21:44 dmq no im going back friday afaik.
21:44 dmq unfortunately. sorry mate.
21:45 * Juerd wonders if there's a GPW irc channel
21:45 dmq actually, i take it back. i have no idea how im getting home.
21:46 dmq we can discuss it there im sure.
21:46 dmq all i know is i love my bahcard50!
21:46 dmq bahncard-50
21:47 Juerd dmq: Well, you could drive along with me :)
21:47 Juerd I'd like a passenger to keep me awake :P
21:47 Juerd (And maybe share fuel costs, but that's secondary, if not tertiary)
21:48 dmq sounds interesting, but as im speaking id kinda feel bad bailing early.
21:48 dmq although ill think about it.
21:48 Juerd Just don't leave before your own talk ;)
21:48 dmq I can definitely sympathise with the desire to have company in the car tho.
21:49 Psyche^ is now known as Patterner
21:49 Juerd I've done Berlin -> Dordrecht and Chemnitz -> Dordrecht alone before, and while it's doable, I didn't like that I was by myself.
21:50 lichtkind dmq i thought you go train?
21:54 dmq i go by car, return by train.
21:54 dmq i have a feeling the same car you are going in.
21:54 dmq :-)
21:54 dmq if you are going with strat.
21:55 Juerd Grin
21:55 lichtkind äh yes
21:56 Juerd ä :)
21:56 iblechbot joined perl6
22:02 svnbot6 r15271 | lwall++ | Factored out rule names from all #= comments; preprocessor now expected to
22:02 svnbot6 r15271 | lwall++ | recognize /^[rule|token|regex] <ident>/ as implicit start of {*} identity.
22:02 svnbot6 r15271 | lwall++ | The #= comments now only add to that base identity.
22:14 dmq it would be cool for referencing if synopsis 5 had anchors on the bullet points.
22:15 dmq :-)
22:15 tene dmq: are you asking for a commit bit?
22:15 Juerd dmq: Patch^WCommits welcome ;)
22:15 tene That kind of talk around here will get you a commit bit if you're not careful.
22:16 dmq is it html or is it generated from pod?
22:16 Juerd tene: I expect dmq to already have one. Or two. Or three. :)
22:16 Juerd dmq: The latter
22:16 Juerd dmq: But that shouldn't stop you, should it? :)
22:16 dmq heh
22:16 Juerd http://feather.perl6.nl/syn/  # these we control ;)
22:16 lambdabot Title: Official Perl 6 Documentation
22:17 dmq I guess its a prioritization issue. Theres things very few people beside me can do, and theres other things that lots of folks can do. Which should i choose?
22:17 larsen_ joined perl6
22:17 Juerd dmq: The most -Ofun things.
22:18 dmq heh
22:18 dmq of the options most of them arent fun.
22:18 dmq :-)
22:19 dmq im currently trying to make unicode character classes in perl 5.10 use a sane data structure.
22:19 dmq i dont even use unicode damnit.
22:19 dmq :-)
22:19 Juerd Oh, but you do!
22:19 Juerd You're sending perfect utf8 sequences to irc ;)
22:19 dmq heh.
22:20 Juerd utf7 too, most of the time
22:20 Juerd Why do you not use unicode?
22:20 dmq because i dont need to.
22:21 dmq i was blissfully unaware of unicode until i started hacking the regex engine.
22:21 dmq what a shock. :-)
22:22 * [particle] doesn't use unicode--because it's <<<unamerican>> :)
22:22 shamu uit
22:22 moritz [particle]: actually that would be a good reason to use it ;)
22:22 shamu uit
22:23 shamu joined perl6
22:23 dmq its a kind of pet rant for me. IMO a big chunk of unicodes horribleness comes from trying to please everyone. Im not sure its the right approach.
22:24 moritz dmq: what would you say is the right approach?
22:24 shamu well, bear in mind that UTF-8 is unicode, and the first 7 bits match one-for-one with ASCII -- that's *Am'rken* Standard code for Information Interchange, podner
22:24 Juerd dmq: But do you never use non-ascii?
22:24 moritz dmq: utf-32 with just one char for each possible sign?
22:25 dmq im not sure what the right solution is.
22:25 SamB I think a better approach would be to try and please someone
22:25 dmq and in some ways yes, utf32 has a bunch of advantages that are relevant to me.
22:26 dmq but i think it comes down to what samb said somewhat. I mean unicode does everything, including dead languages.
22:26 shamu fyi, I've been trying to figure out what 'Unicode' actually is for a long time
22:26 * moritz would like to see a generic utf-2**n (for any integer n ;) *duck*
22:26 Juerd dmq: Does it matter that it includes dead languages?
22:26 shamu any recommendations?  The only intelligible one I ever found was http://www.joelonsoftware.com/articles/Unicode.html
22:26 lambdabot Title: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know A ...
22:26 SamB I wish it tried to do something right
22:27 Juerd shamu: That's a good one.
22:27 dmq my view is human communication evolves just like the rest of our cultural institutions and we shouldnt waste computer cycles because somebody might want to use ancient greek.
22:27 Juerd shamu: See also wikipedia. And if you want to use it with Perl, perlunitut.
22:27 SamB rather than trying to do everything, except leaving it all half-done
22:27 lichtkind http://svn.pugscode.org/pugs/src/perl6/Perl-6.0.0-STD.pm is this perl6 metadata es part of the interpreter
22:27 Juerd dmq: Ancient Greek needs to be digitized too, if we are to preserve things.
22:27 dmq unfortunately im also a native english speaker, so people tend to think im being an english bigot when i say stuff like that. but i mean seriously.
22:28 dmq sure digitize it. just dont make every program everywhere pay the price.
22:28 Juerd dmq: And it's great that we can put it all in one encoding.
22:28 shamu well, computers are now multicore at 2Ghz plus -- what's the point, if not to have some of those cycles to waste on reading and publishing something the ancient greeks figured out a long time ago, so we can learn from their mistakes and hopefully culturally evolve
22:28 Juerd dmq: If only for web pages, it's useful.
22:28 SamB also, mathemeticians needed those letters anyway
22:28 dmq sure there are places where unicode is useful.
22:28 dmq like for a web browser.
22:28 dmq but for an os?
22:29 shamu also, isn't every ascii program automatically utf-8?
22:29 dmq or a programming language?
22:29 masak other recommendations: http://www.tbray.org/ongoing/When/200x/2003/04/06/Unicode
22:29 Juerd dmq: Well, programming languages deal with the web, too.
22:29 lambdabot Title: ongoing · On the Goodness of Unicode
22:29 masak http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF
22:29 Juerd So if it's useful for the web, it's automatically useful for programming languages too.
22:29 lambdabot Title: ongoing · Characters vs. Bytes
22:29 masak http://www.tbray.org/ongoing/When/200x/2006/10/22/Unicode-and-Ruby
22:29 lambdabot Title: ongoing · Unicode and Ruby, http://tinyurl.com/yoz8sp
22:29 dmq shamu have you looked at what is required to read a utf8 stream even if it only ascii.
22:30 shamu well, if it is only ascii, can't you just say 'getchar(); if high bit set, abort reading stream'?
22:30 shamu I mean 7-bit ascii
22:30 dmq ascii is 7 bit
22:31 Juerd Very few texts that I encounter are pure ascii.
22:31 dmq utf8 is exactly equivelent to ascii for code points 0-127
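A minimal sketch of the reader shamu describes above, assuming the input is already in memory as bytes. The function name `read_ascii_only` is made up for illustration. It rejects any byte with the high bit set, and whatever it accepts decodes identically as ASCII and as UTF-8, since the two agree on code points 0-127.

```python
def read_ascii_only(data: bytes) -> str:
    """Decode bytes as 7-bit ASCII, refusing any byte with the high bit set."""
    for offset, byte in enumerate(data):
        if byte & 0x80:  # high bit set: not 7-bit ASCII
            raise ValueError(f"non-ASCII byte 0x{byte:02x} at offset {offset}")
    return data.decode("ascii")


sample = b"plain old text"
# Pure ASCII input gives the same string under either decoding.
same = read_ascii_only(sample) == sample.decode("utf-8")
```

The catch dmq raises still applies: this only works if you are willing to abort (or switch decoders) the moment a high-bit byte shows up.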
22:31 Juerd Most of the time it's utf8, second place goes to latin1.
22:31 dmq juerd sure: but dont you think that a fixed wisth 16 bit encoding would be sufficient for all your needs?
22:31 Juerd But I deal with iso-8859-15 and -3, and windows-125x, and koi8-r, too.
22:32 Juerd dmq: I personally don't care if Perl *internally* uses 16 bit or not.
22:32 dmq i can see the utlity in unicode, but using it as a standard internal encoding doesnt make sense to me.
22:32 SamB you would prefer what?
22:32 Juerd dmq: But when outputting things, I like to use utf8, because then I can still use it on an old fashioned terminal.
22:32 dmq note the misspelled "fixed width"
22:32 SamB no standard internal encoding?
22:33 SamB all programs having to deal with things in unspecified encodings?
22:33 Juerd SamB: My personal preference for perl's internal string encoding is bitwise negated utf8.
22:33 dmq yes, use a kludge to work around a kludge.
22:33 SamB and files people send you often not working?
22:33 dmq im not saying there are easy solutions or that unicode isnt the least worst.
22:33 dmq but i dream of a better world :-)
22:34 SamB oh, by all means unicode is horrible
22:34 dmq SamB: I bet it is extremely rare to find a document that contains all 100k letters in it.
22:34 Juerd SamB: There must be some internal encoding, but as long as Perl knows which one it is, and how to convert it to the other encodings, all are fine with me.
22:34 SamB but is currently the least-horrible available thing of its kind
22:34 * Juerd really wants his ~utf8, but lacks C fu :(
22:35 Juerd (and tuits)
22:35 dmq anyway, im probably biased as with what ive been coding tends to require efficient random access to characters. Which most unicode encodings dont allow.
22:35 SamB ah.
22:35 SamB try UTF32.
22:36 [particle] juerd: why ~?
22:36 Juerd [particle]: So the mistake of forgetting to encode your output is clearly visible.
22:36 dmq yes, ill just recode perl5 to use utf32 internally. :-)
22:36 [particle] win32 encodes files in ucs2 iirc
22:36 Juerd [particle]: As would the mistake of not decoding.
22:37 [particle] juerd: should be really easy on parrot
22:37 allbery_b hm, iisn't ucs2 deprecated?
22:37 SamB Juerd: that is what typesystems are for
22:37 Juerd [particle]: If Perl 6 has strings the way I think they will be, that won't be necessary. :)
22:37 dmq before xp it did, xp and later it does utf-16
22:37 dmq http://en.wikipedia.org/wiki/UTF-16
22:37 Juerd SamB: Yes, but Perl 5 doesn't have them, and does need to support BOTH unicode and text.
22:37 allbery_b that makes more sense
22:37 Juerd s/text/binary/
22:37 gnuvince_ joined perl6
22:37 SamB isn't utf-16 only a little different from ucs2?
22:37 SamB in practice?
22:37 dmq yes
22:37 [particle] yes
22:38 dmq no surrogate pairs
22:38 SamB about as different as utf-8 and ascii?
22:38 Juerd No, much less different.
22:39 SamB except for the small detail of there not being terribly many characters in Unicode that don't fit in two bytes?
22:40 dmq no, ascii and utf8 are very different.
22:41 dmq oh well, ok, yes, ascii->utf8 fixed width/variable width, utf-16->ucs2 fixed width/variable width.
22:42 dmq whatever.
22:42 ProperNoun joined perl6
22:43 SamB also, both utf-8 and utf-16 use what were unasigned codepoints for their multi-word character codings, yes?
22:44 dmq if i get the question yes for utf8, pass on utf-16.
22:46 SamB I'm pretty sure that the codepoints corresponding to the words in surrogate pairs were previously unasigned
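For reference, a small sketch of how UTF-16 builds a surrogate pair for a code point that does not fit in 16 bits. The helper name `to_surrogates` is hypothetical; the arithmetic is the standard UTF-16 one, and the result is checked against Python's own `utf-16-be` codec.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point fits in one 16-bit unit or is out of range")
    v = cp - 0x10000                 # 20 bits remain
    high = 0xD800 + (v >> 10)        # top 10 bits go into the high surrogate
    low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogates(0x1D11E)   # U+1D11E MUSICAL SYMBOL G CLEF
encoded = chr(0x1D11E).encode("utf-16-be")
matches_codec = encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

UCS-2 has no such pairs, which is the "fixed width versus variable width" difference discussed above.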
22:46 allbery_b actually I'd claim utf8 is not very different from ASCII, because ASCII only defines code points 0-127
22:48 diotalevi ASCII? No, I thought that defined all of 0 -> 255.
22:48 dduncan no, ASCII is 7-bit
22:48 Juerd ASCII is 7 bit. It cannot have anything > 127
22:48 Juerd Not without compression, at least :)
22:49 dduncan and UTF-8 is identical to ASCII for codepoints 0..127, afaik, which is part of its appeal
22:49 Juerd diotalevi: There are ascii-compatible encodings like iso-8859-1, cp437, etcetera, that have 255 characters.
22:49 dmq we are talking about the merits of an encoding tho.
22:49 diotalevi dduncan: er, minus some of the control character parts of ASCII. I thought that differed slightly.
22:49 dmq so the relevence of codepoint equivelency is kinda a seperate issue
22:50 Juerd diotalevi: No, 0..127 are fully equal in ascii and utf8
22:50 dduncan if you're going to use unicode, which I recommend to be the default, I would say that UTF-8 is the best default bet
22:50 dduncan its other advantages include being byte order independent
22:50 [particle] ascii is a codeset and an encoding, so it's hard to speak about clearly
22:50 [particle] s/codeset/charset/
22:50 dduncan and relatively compact
22:51 dmq and reading ascii is not identical to reading utf8 unless you know in advance that you are really dealing with ascii.
22:51 dduncan also, UTF-8 is encoded such that you can start in a text stream at any byte and you can easily tell where the character boundaries are
22:52 Juerd dmq: Can't you read in ascii, and upgrade to utf8 when you encounter the first high bit?
22:52 dmq so if you dont know, or you are dealing with characters outside of ascii you have to do the clumsy read and scan of utf8.
22:52 dduncan which helps reliability
22:52 dmq juerd: im kinda inclined to think that encoding should be a problem the coder deals with. only they have the information to make the right decision.
22:53 diotalevi Say, 127 isn't defined in Unicode but is DEL in ASCII. Are you sure that one is the same?
22:53 dmq utf8 is /defined/ to be ascii for lowbit bytes (in a wellformed utf8 string)
22:53 [particle] transcoding is definitely a user issue. but support for major encodings should be supported in core ops/libraries
22:53 diotalevi er, wait. I was reading the wrong line.
22:53 sunnavy joined perl6
22:54 dmq particle: yes i agree pretty much.
22:55 dmq dduncan: that is true.
22:56 dmq about finding the boundaries from a given point. but finding boundaries doesnt replace the fact you cant do random access.
22:56 dduncan if you know a stream is UTF-8, then you can do random access
22:56 SamB utf-8 probably sucks as an in-memory representation
22:57 dmq dduncan: how do you reckon.
22:57 SamB but not so bad for an on-disk encoding for programs, usually...
22:57 Psyche^ joined perl6
22:57 dmq you need to do a linear scan.
22:57 elmex joined perl6
22:57 dduncan the bit patterns of utf-8 characters are such that you can recognize just from looking at no more than 6 consecutive bytes where the character boundary is
22:57 dmq heh. i wonder, maybe some of those old algorithms for tape would be useful.
22:58 dmq dduncan: that means you scan.
22:58 dduncan but you don't scan from the start of the string, which is my point
22:58 dduncan a handful of bytes is nothing
22:58 dmq there is no way without scanning to say "jump to the 10th boundary from here"
22:58 allbery_b you can spot *a* character boundary but ==dmq
22:59 dduncan its when you have to start at the beginning of the string to know how to interpret the characters you get to correctly, which is the problem
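A sketch of the boundary-finding property dduncan is describing, assuming well-formed UTF-8. Continuation bytes always look like `10xxxxxx`, so from any byte offset you can back up a few bytes to the start of the character. The name `char_start` is invented for this example.

```python
def char_start(buf: bytes, pos: int) -> int:
    """Back up from pos to the first byte of the UTF-8 character containing it."""
    # Continuation bytes have the bit pattern 10xxxxxx; skip back over them.
    while pos > 0 and (buf[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos


buf = "a\u00e9\u20ac".encode("utf-8")   # 1-byte, 2-byte and 3-byte characters
start = char_start(buf, 5)              # offset 5 is the last byte of the euro sign
```

This finds *a* boundary near an arbitrary byte offset. It does not answer dmq's question of where the 10th character is, which still needs a scan.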
23:00 allbery_b on the flip side, 32-bit chars are always fast to index but slow to do anything else with (see Haskell [Char])
23:00 dduncan while "go to 10th character" is needed for some apps, many apps don't require you to do that, such as things with data interchange or network operations
23:00 Juerd dduncan: "If you know a stream ..., random access ..." That's the problem: utf8 only makes sense as a stream. You need to scan it. Therefore, there's no sane way to do random access, unless you keep an offset map.
23:00 allbery_b (well, they also have indexing issues because it's [Char] instead of Array ... Char)
23:00 dduncan for the apps that need to do this, you can transcode it to UCS32 for internal use
23:01 Juerd (For *huge* data, it makes sense to keep an offset map of every 128th byte, or so)
23:01 dduncan er, UCS4
23:01 dduncan (ucs uses bytes, utf uses bits)
23:01 dmq huh and huh?
23:01 dduncan afaik
23:01 lichtkind night folks, i believe in you !
23:02 dduncan er, the numbers in the names of UCS count in bytes, in UTF, bits
23:02 dduncan that's what I meant to say
23:02 dmq i thought the ucs names were the old ones
23:02 dduncan they are
23:03 dduncan utf is more modern, and what I prefer
23:03 * Juerd hungry.
23:03 dmq right
23:03 * dduncan ditto
23:03 dmq i dont get the every 128 bytes comment exactly. i probably havent thought about it longer.
23:03 dmq long enough
23:04 sunnavy joined perl6
23:04 dduncan I don't know the significance of 128 bytes either
23:04 Psyche^ is now known as Patterner
23:05 [particle] i think he means something to mark the start of a grapheme
23:05 dmq ah i see.
23:05 [particle] so you can seek to that position and tell safely
23:05 dmq right. that makes sense.
23:06 dmq but then theres the overhead of doing that.
23:06 dmq sigh. it all sucks.
23:06 dmq :-)
23:06 dduncan so there's a marker for each 128 bytes that says what character number is there?
23:07 dduncan I think that makes sense
23:07 Juerd dmq: If you, during reading, scan everything and cache the character offset for every 128th byte (rounded up or down to full character boundaries), you can more efficiently locate character N, because you can start scanning at the closest checkpoint.
23:07 Juerd dmq: As said, this is only beneficial for *huge* data.
23:07 allbery_b yeh, so instead of counting from the start you can pick it up in the middle.  tradeoff between overhead of keeping a count and having to step
23:07 Juerd Like, entire books :)
23:08 dmq right right
23:08 Juerd (And even then, you should think twice before going through the trouble of implementing all this.)
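A rough sketch of the offset map Juerd describes, assuming well-formed UTF-8 held in memory. The function names and the choice to round each checkpoint up to the next character boundary are illustrative assumptions, not anything from Perl's internals.

```python
import bisect

STEP = 128  # checkpoint spacing in bytes, as in the discussion


def build_checkpoints(buf: bytes, step: int = STEP) -> list:
    """Return (char_index, byte_offset) pairs, roughly one per `step` bytes."""
    checkpoints = [(0, 0)]
    chars = 0
    next_mark = step
    for off, byte in enumerate(buf):
        if (byte & 0xC0) != 0x80:          # a character starts at this byte
            if off >= next_mark:           # first boundary at or past the mark
                checkpoints.append((chars, off))
                next_mark = (off // step + 1) * step
            chars += 1
    return checkpoints


def byte_offset_of_char(buf: bytes, checkpoints: list, n: int) -> int:
    """Find the byte offset of character n, scanning from the nearest checkpoint."""
    # Last checkpoint whose character index is <= n.
    i = bisect.bisect_right(checkpoints, (n, len(buf))) - 1
    chars, off = checkpoints[i]
    while chars < n and off < len(buf):
        off += 1
        while off < len(buf) and (buf[off] & 0xC0) == 0x80:
            off += 1                       # skip continuation bytes
        chars += 1
    if off >= len(buf):
        raise IndexError("character index out of range")
    return off


text = "a" + "\u00e9" * 300 + "xyz"
buf = text.encode("utf-8")
cps = build_checkpoints(buf)
off = byte_offset_of_char(buf, cps, 301)   # byte offset of the 'x'
```

The scan from a checkpoint covers at most about `STEP` bytes instead of the whole prefix, at the cost of one table entry per `STEP` bytes. That is the overhead tradeoff being argued over for 4 kB strings.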
23:08 dmq no no
23:08 dmq im still on character classes in unicode.
23:08 dmq no worries.
23:08 [particle] well, if it's static content... just create a lookup table
23:08 Juerd (After all, the (Christian) Bible, fits in 1.44 MB! :P)
23:08 allbery_b I'd actually say it's worthwhile if the string is >4k or so
23:08 Juerd allbery_b: 4 kB already?!
23:08 dmq i cant quite get invert(invert($class)) to work.
23:08 Juerd Nah, I don't think it will.
23:09 Juerd [particle]: The mapping I referred to *is* a lookup table :)
23:09 dmq i dont suppose anybody has the unicode book that covers inversion lists handy?
23:09 allbery_b of course, most strings are << that, so it's still not much of a win in practice
23:09 sunnavy joined perl6
23:09 [particle] juerd: sorry, i meant *store the lookup table
23:10 Juerd allbery_b: «? ;)
23:10 * [particle] is distracted by food
23:10 Juerd allbery_b: I think that with a 4 kB string, the overhead of keeping a mapping table is still too large to benefit from it.
23:10 allbery_b *you* try doing unicode through a vnc client on OSX sometime :>
23:22 TimToady dmq: don't get fixated on random access to strings.  it's only going to get less important with time.  And not even UTF-32 is a fixed width encoding of graphemes, which is what the user really wants to think in terms of anyway.
23:24 TimToady regexes don't really need random access, for instance.  nearly all the offsets are very small and relative to your current position.
23:24 TimToady the quest for a fixed unit of storage to represent characters is misguided in my opinion except as an optimization that is below the abstraction level of the programmer.
23:25 TimToady very few people complained that substr slowed down when we went with utf-8 in perl 5
23:27 Gothmog_ That's not necessarily a good argument.
23:28 TimToady It's not necessarily a bad argument either.  :)
23:28 allbery_b it's a "good enough" argument.  which, given that you can't have perfection, is not a bad thing
23:28 TimToady the point is that substr and friends aren't all that useful once we start getting away from the punchcard metaphor of text.
23:29 TimToady nearly all the pattern matching done in Perl is done with regex,
23:29 Gothmog_ I think of UTF-8 vs. some fixed width encoding as a speed vs. memory trade-off.
23:29 TimToady and regex naturally finds boundaries without caring about large offsets
23:29 TimToady Gothmog_: fine, but that should be below where the typical user is thinking.
23:30 TimToady which is at the grapheme level, which corresponds to what the user thinks of as a "character".
23:30 Gothmog_ Hm right, but it might be important if some kind of string lookup is O(1) or O(n).
23:30 Gothmog_ Like that's why we use hashes and not array of pairs.
23:30 Gothmog_ s/array/$&s/
23:31 TimToady that's one of the things that a VM is pretty good at optimizing on the fly
23:31 TimToady but I am adamant on the subject that a string position in Perl 6 is *not* *not* *not* an integer.
23:32 Gothmog_ Hm.
23:32 TimToady it's one of my hot buttons, in fact
23:33 Gothmog_ What is it that a VM can optimize on the fly, and what do you think should a string position be, if not an int?
23:33 moritz so what is it? a pointer?
23:33 dmq not an integer?
23:33 TimToady absolutely not
23:33 dmq wider?
23:33 TimToady integers don't know their units
23:34 diotalevi . o O ( A marker? )
23:34 TimToady yes, basically a marker
23:34 dmq ah ok. a vector.
23:34 Gothmog_ So, you want to differ n bytes / n graphemes / n whatever?
23:34 dmq so it wont count code points?
23:34 TimToady if you force it to count in a particular unit, you must make sure it knows the correct units
23:35 TimToady dmq: by default, no
23:35 dmq interesting.
23:35 Gothmog_ And what happens if you don't enforce a particular unit?
23:35 TimToady the default in Perl 6 is graphemes, and has been from day one
23:35 Gothmog_ That seems to be sane.
23:35 moritz so is 1 grapheme = 1 code point in this context?
23:35 TimToady the default Unicode level is to count by graphemes
23:35 dmq i suppose its true.
23:36 TimToady a grapheme may be several code points
23:36 dmq you dont necessarily need to store a full map.
23:36 TimToady a base character plus its combining characters, basically
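To make the grapheme versus code point distinction concrete, here is a rough sketch using only Python's standard `unicodedata` module. It groups each base character with the combining marks that follow it, which matches the "basically" in the description above. It is an approximation, not full Unicode grapheme cluster segmentation, and `rough_graphemes` is a made-up name.

```python
import unicodedata


def rough_graphemes(s: str) -> list:
    """Group each base character with its trailing combining marks (approximate)."""
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # attach the mark to the preceding base
        else:
            clusters.append(ch)     # start a new cluster
    return clusters


s = "e\u0301a"                       # 'e' + COMBINING ACUTE ACCENT, then 'a'
code_points = len(s)                                # 3
utf32_units = len(s.encode("utf-32-be")) // 4       # also 3
graphemes = len(rough_graphemes(s))                 # 2
```

Even in UTF-32 the string takes three fixed-width units for what a reader sees as two characters, which is the point about UTF-32 not being fixed width for graphemes.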
23:37 TimToady that is also why there is no .length method in Perl 6
23:37 Gothmog_ But if you access the nth grapheme, n is an int, or not?
23:37 TimToady it will grudgingly translate n to a string position, and then try to maintain the abstraction from then on.
23:37 dmq ive been thinking of how to store a trail of positons reached via accepting states from a DFA so that it cant be used intermingled with the backtracking engine.
23:38 dmq and your right, that is all localized small offsets.
23:39 dmq thanks. thats a useful observation.
23:39 TimToady indeed, I'm an acquaintance of the person who hacked the utf-8 matcher into regexec.c  :)
23:40 Psyche^ joined perl6
23:40 TimToady so basically Perl 6 has string positions as opaque markers or pointers
23:41 TimToady internally it can be a string plus a byte or codepoint offset, but that's hidden from the user's view.
23:42 dmq right but regexec.c can keep that kind of data on stack.
23:42 TimToady and you can force string positions and lengths back to numbers as long as you specify the units
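One way to picture the idea is a purely hypothetical Python class: the position is an opaque object holding the string and a hidden offset, and you only get a number out by naming the units. None of these names come from Perl 6 itself, and the grapheme count uses the same rough base-plus-combining-marks approximation as before.

```python
import unicodedata


class StrPos:
    """Hypothetical opaque string position: a string plus a hidden offset."""

    def __init__(self, s: str, cp_offset: int):
        self._s = s
        self._cp = cp_offset        # internal detail, not meant for callers

    def as_count(self, units: str) -> int:
        """Convert the position to a number, but only in explicitly named units."""
        prefix = self._s[:self._cp]
        if units == "bytes":
            return len(prefix.encode("utf-8"))
        if units == "codepoints":
            return len(prefix)
        if units == "graphemes":
            # Approximation: count characters that are not combining marks.
            return sum(1 for ch in prefix if not unicodedata.combining(ch))
        raise ValueError(f"unknown units: {units!r}")


pos = StrPos("e\u0301ab", 3)        # just before the 'b'
counts = {u: pos.as_count(u) for u in ("bytes", "codepoints", "graphemes")}
```

The same position comes out as three different integers depending on the units, which is the sense in which "integers don't know their units".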
23:42 TimToady yes, that's internal
23:42 dmq if you mean what i think you mean.
23:43 dmq so you can cheat when you end up with easy units like single width chars right?
23:43 TimToady yes, but only when you know it for sure.
23:44 dmq but with a dfa, everything is a codepoint/char. so you end up hypothetically building a scary stack.
23:44 TimToady Perl 5's type system is a bit dicey on the subject of knowing such a thing.
23:44 TimToady dfa's are always scarey. :)
23:45 dmq so i was thinking that if its offsets (therefore localized), and run length encoded, then you could do it for a dfa without worrying about the stack blowing up.
23:45 dmq the units would be codepoints i guess.
23:45 Psyche^_ joined perl6
23:46 dmq im really interested in the idea of making as much of a regex happen using a dfa.
23:48 CardinalNumber joined perl6
23:49 TimToady everything in S05 about "longest token" is aimed at the same goal.
23:49 dmq yeah.
23:49 TimToady but it tends to make more sense for a parser than for a one-shot regex
23:49 TimToady which are often more efficient with a Boyer-Moore algorithm
23:50 dmq and i noticed that other semantics are chosen to make longest token not be super-expensive when each branch cant be handled via a dfa.
23:50 TimToady because the dfa is required to look at every character, and BM isn't
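A short sketch of why a Boyer-Moore style search can skip characters. This is Horspool's simplification (bad-character shifts only), not the full Boyer-Moore algorithm and not what regexec.c does; `horspool_find` is an illustrative name.

```python
def horspool_find(text: str, pat: str) -> int:
    """Return the index of the first occurrence of pat in text, or -1."""
    m, n = len(pat), len(text)
    if m == 0:
        return 0
    # For each character of pat except the last, how far the window may slide
    # when that character is aligned with the window's last position.
    shift = {c: m - 1 - i for i, c in enumerate(pat[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pat:
            return pos
        # Characters absent from pat allow a jump of the full pattern length.
        pos += shift.get(text[pos + m - 1], m)
    return -1


found = horspool_find("hello world", "world")
```

When the character under the end of the window does not occur in the pattern, the window moves `len(pat)` positions at once, so some input characters are never inspected. A DFA, by contrast, consumes every character.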
23:50 dmq right.
23:50 dmq the dreaded offsets code.
23:50 dmq i almost got lookbehind properly optimsable, but then my head exploded.
23:51 TimToady well, dfa is in the abstract side-effect free, and a parser wants to be full of side effects.
23:51 TimToady so you have to manage the transition from patterns to actions somehow
23:51 dmq yes
23:51 TimToady much like your typical awk statement
23:52 TimToady P6 requires reversibility on lookbehind patterns
23:52 TimToady (though that implies an encoding that can be scanned backwards too)
23:53 dmq heh
23:53 Psyche^_ is now known as Patterner
23:54 dmq it still would be nice to extract fixed substrings from them
23:54 dmq if possible.
23:55 dmq so that things like /(?<=foo)/ can be as efficient as /foo/
23:55 dmq i almost had it working.
23:55 TimToady it should match oof but run the counter the other way
23:56 dmq or it could just BM for 'foo'
23:56 dmq :-)
23:56 TimToady no, that's the wrong approach entirely
23:56 dmq and use the spot after it.
23:57 dmq or what about /(?<=foo)bar/ it should just look for 'foobar' and use the middle.
23:58 TimToady that's a possible optimization, but in the abstract it's not difficult to look for a position with oof going left and bar going right.
23:58 dmq i realize it doesnt scale when you add quantifiers, im talking about an optimisation only.
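A small sketch of the optimisation dmq is proposing, written with Python's `re` module rather than Perl's engine: for a fixed-string lookbehind, searching for the literal `foobar` and reporting the position after `foo` gives the same match starts as `(?<=foo)bar`. The helper `find_all` is hypothetical.

```python
import re


def find_all(text: str, needle: str) -> list:
    """Return the start offsets of all non-overlapping occurrences of needle."""
    hits, start = [], 0
    while True:
        i = text.find(needle, start)
        if i < 0:
            return hits
        hits.append(i)
        start = i + len(needle)


text = "xxfoobar foo bar barfoobar"
via_lookbehind = [m.start() for m in re.finditer(r"(?<=foo)bar", text)]
via_literal = [i + len("foo") for i in find_all(text, "foobar")]
equivalent = via_lookbehind == via_literal
```

As noted in the discussion, this only holds for fixed strings; once quantifiers enter the lookbehind, the literal search no longer stands in for it.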
23:58 TimToady and the user nearly always has a good reason for having written it that way in the first place.
23:58 TimToady so you're almost never going to be able to do that optimization anyway
23:58 dmq i think mainly to not have bar in $& or $1
23:59 dmq actually that type of thing is quite common in split.
