Perl 6 - the future is here, just unevenly distributed

IRC log for #rosettacode, 2011-07-26


All times shown according to UTC.

Time Nick Message
00:11 rodt joined #rosettacode
00:12 benbe2 left #rosettacode
01:32 kpreid left #rosettacode
02:18 kpreid joined #rosettacode
03:36 BenBE left #rosettacode
06:33 rodt left #rosettacode
06:57 damagednoob joined #rosettacode
06:59 damagednoob left #rosettacode
07:05 TimToady left #rosettacode
07:05 TimToady_ joined #rosettacode
10:22 FireFly joined #rosettacode
12:07 mikemol Ok, I've done my own stirring of the pot wrt [[Talk:Pi]] and [[Village Pump/Approximate fit solutions]].
12:07 fedaykin http://rosettacode.org/wiki/Talk:Pi  http://rosettacode.org/wiki/Village_Pump/Approximate_fit_solutions
12:16 BenBE joined #rosettacode
12:46 mikemol BenBE: snippets folder removed
13:14 BenBE mikemol thanks.
14:12 * mikemol terminated the CafePress account. Waiting on termination of the Zazzle and Text-Link-Ads accounts.
14:13 mikemol CP and Z accounts being terminated because they weren't worth it; I'd rather dedicate some shelf space and do sales and shipping directly than deal with their overhead.
14:14 mikemol TLA being terminated because I may begin handling that directly, too; I have someone who's interested, and TLA is too generalized to have a clue how to market to something like RC's users.
14:47 mwn3d_phone Just remember to update the finances page in case people wanted to go to the store
14:47 realazthat_ joined #rosettacode
14:51 realazthat left #rosettacode
15:04 * mikemol nods
15:08 dagnyscott1 joined #rosettacode
15:08 kpreid left #rosettacode
15:11 mwn3d_phone Don't worry about it for a bit if you're still busy with work. We don't need a frustrated and stressed admin.
15:13 realazthat_ is now known as trolzies
15:15 * BenBE ponders about releasing the RC code snippets separately from the language detection ... Otherwise it might take a while for the release ...
15:58 kpreid joined #rosettacode
16:04 TimToady_ is now known as TimToady
16:11 mikemol Zazzle account is terminated.
16:13 mikemol BenBE: Just a quick walkthrough on the legalese. I am not the copyright holder on those snippets, whoever submitted them to Rosetta Code is. They licensed them to me under the GFDL 1.2, and, by the terms of the GFDL, to you.
16:13 mikemol To my understanding of the legalese of the GFDL, as the copyright holder, they've not granted a permanent license; meaning they can revoke it.
16:14 mikemol What that means for me is that I may have to someday remove content from my site if someone revokes my license to it. I believe the same applies to you.
16:21 realazthat_ joined #rosettacode
16:21 trolzies left #rosettacode
16:31 * TimToady is working up to a new state-of-the-onion talk; is there a better way to download a snapshot of everything, or are things still in the state they were last time I downloaded a snapshot a couple months ago?
16:32 * TimToady would like to have a fresh snapshot on his machine in case the network fails...
16:35 MigoMipo joined #rosettacode
16:39 mikemol TimToady: Pretty much in the same position, sadly.
16:40 mikemol However, I may have a bit more assistance available this time.
16:40 mikemol Coderjoe: *poke*
16:41 TimToady thanks
16:42 mikemol Coderjoe: TimToady needs an offline copy of RC. Last time we tried this, I wasn't able to figure out the right settings to wget to get all the on-domain references converted to local relative.
16:42 mikemol Coderjoe: I know you've got expertise in this area...
16:45 TimToady a slightly related question is why a page like http://rosettacode.org/wiki/Category:Perl_6 is split into chunks of 200 or so, when the unimplemented page can easily list all of the entries on the same page?
16:45 fedaykin "Category:Perl 6 - Rosetta Code"
16:46 TimToady the second page is harder to process into relative links than the first one
16:46 mikemol TimToady: Because the former is a category link that deals with things the way MediaWiki deals with categories, and the latter is a Mediawiki extension written by opticron.
16:47 * mikemol looks more closely at the second page
16:47 TimToady seems like MediaWiki would be tweakable on that
16:47 mikemol I hadn't noticed a difference in how it forms URIs, but I hadn't been looking.
16:47 TimToady mainly it's hard to make the relative links between the two pages right
16:47 TimToady iirc
16:48 TimToady it's been a while since I postprocessed an rc
16:48 mikemol TimToady: Possibly. I had to find and fix a bug in MW which had hardcoded limits on result counts for the RecentChanges page. I'll see if I can find the setting to change the paging count for categories.
16:50 mwn3d_phone I don't see anything in the URL of the "next 200" link that could change that size (like in the recent changes URL)
16:50 mikemol Looks like this is the setting in question: http://www.mediawiki.org/wiki/Manual:$wgCategoryPagingLimit
16:50 fedaykin "Manual:$wgCategoryPagingLimit - MediaWiki" http://rldn.net/EfV
16:51 mikemol Changed to 2000.
16:51 mikemol That should last a while. :)
16:52 TimToady \o/
16:52 TimToady ^^ "yay" in P6-ese
16:53 TimToady it worked!  mikemol++
16:55 mikemol Incidentally, if there are any other configuration options people would like me to change, here's the list: http://www.mediawiki.org/wiki/Category:MediaWiki_configuration_settings
16:55 fedaykin "Category:MediaWiki configuration settings - MediaWiki" http://rldn.net/4sC
17:17 mwn3d_phone Also we're on 1.16.0 so you get to the right list
17:47 mwn3d_phone left #rosettacode
18:00 Coderjoe -p,  --page-requisites    get all images, etc. needed to display HTML page.
18:00 Coderjoe -k,  --convert-links      make links in downloaded HTML or CSS point to
18:00 Coderjoe local files.
18:01 Coderjoe note that -k happens at the end, after everything is downloaded
18:08 mikemol Running on server:
18:08 mikemol "wget -pkm http://rosettacode.org/"
18:08 mikemol Looks like it's currently CPU-bound inside PHP.
18:10 mwn3d_phone joined #rosettacode
18:23 mikemol So far, grabbed 95MB of data.
18:29 mikemol It'd be nice if wget used HTTP 300 redirects as an opportunity to create symlinks.
18:30 mikemol Sure, not all filesystems support symlinks. Then again, not all filesystems support filenames like "Category:Java"...
18:32 TimToady css-y things tend to get lost too, sometimes
18:47 rodt joined #rosettacode
19:12 mikemol TimToady: http://rosettacode.org/public/resources/2011-07-26-rosettacode.org.tar.bz2
19:12 fedaykin http://rldn.net/Ozxk
19:12 mikemol 71M
19:13 mikemol 300-some megs uncompressed.
19:13 mikemol That's a lot of text data.
19:13 rodt is that the entire site ?
19:13 mikemol rodt: Not including history, yeah.
19:13 rodt nice :)
19:13 mikemol Also not including the 'Special:' namespace, as that's blocked by robots.txt as well.
19:14 rodt whats that admin stuff ?
19:14 mikemol Among other things.
19:14 mikemol rodt: http://rosettacode.org/wiki/Special:SpecialPages
19:14 fedaykin "Special pages - Rosetta Code"
19:15 mikemol I don't know offhand what you'll see in there; my privs are different from yours, so I tend to see more things.
19:15 rodt no nasm ?
19:15 mikemol Feel free to correct that.
19:16 rodt :), ill prolly have some pics for you soon ;)
19:16 TimToady mikemol: thanks
19:16 mikemol np.
19:17 mikemol Please check it asap, so Coderjoe and I can figure out how to get a proper static copy set up.
19:17 TimToady mikemol: the link doesn't seem to work
19:17 mikemol Also, as long as you're doing postprocessing, a copy of your post script would be nice.
19:17 mikemol Hm
19:17 mikemol erp.
19:17 mikemol remove the /public/
19:18 mikemol So: http://rosettacode.org/resources/2011-07-26-rosettacode.org.tar.bz2
19:18 * mikemol wondered if the bot would try downloading the link.
19:18 mikemol I forget; who owns fedaykin?
19:18 mikemol [[Pi]]
19:18 fedaykin http://rldn.net/Dcu
19:18 fedaykin http://rosettacode.org/wiki/Pi
19:19 mikemol opticron: I think I broke your bot. Or came close.
19:19 opticron heh
19:19 opticron it spent a good 45 seconds on that link, so maybe
19:20 opticron yeah
19:20 opticron it totally pulled down the whole thing
19:20 opticron need to fix that
19:20 opticron with HTTP HEAD
19:21 opticron instead of GET
19:21 * mikemol nods
19:21 mikemol That's what I was thinking; and skip it if it doesn't match an understood Content-Type.
19:21 opticron right
19:21 mikemol Well, understood and *blessed*
19:22 mikemol I assume you have some kind of handler registry, and you don't want to be pulling down arbitrary links in strange channels on IRC...
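The HEAD-before-GET idea discussed above can be sketched roughly as follows. This is only an illustration, not fedaykin's actual code: the names `BLESSED_TYPES`, `media_type`, `is_blessed`, and `should_fetch` are hypothetical, and the whitelist contents are an assumption.

```python
import urllib.request

# Hypothetical whitelist; the bot's real handler registry is not shown in the log.
BLESSED_TYPES = {"text/html", "application/xhtml+xml"}


def media_type(content_type_header):
    """Drop parameters such as '; charset=utf-8' and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_blessed(content_type_header, blessed=BLESSED_TYPES):
    """True if the header names a media type we are willing to download."""
    return media_type(content_type_header) in blessed


def should_fetch(url, timeout=10):
    """Send a HEAD request only; report whether a follow-up GET seems worthwhile."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_blessed(response.headers.get("Content-Type", ""))
```

With a check like this, a link to a large tarball would presumably be rejected on its Content-Type before any of the body is transferred.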
19:22 opticron it's threaded though, so a link like that shouldn't kill it entirely
19:23 opticron it doesn't pull the file to disk anyway
19:23 mikemol I wonder
19:23 opticron it chews on memory and swap space and that functionality will die if it runs itself out of memory
19:23 opticron but even at that point I don't think the IRC interface will go offline
19:24 opticron I've had people link it to ISOs maliciously before
19:24 mikemol I just put a symlink to /dev/zero in my /resources directory, but that didn't behave as expected when I tried wget. ^^
19:26 opticron fedaykin is a very modular bot, thankfully
19:26 opticron probably even overbuilt
19:30 kpreid left #rosettacode
19:31 kpreid joined #rosettacode
19:44 TimToady ah yes, it's coming back to me; also have to change Category: not to look like an http: to firefox, and rename some bare files to .html
19:45 * TimToady wonders if he left a program sitting around to do that...
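One way the "Category:" fix-up might look, as a minimal sketch: a relative link whose first path segment contains a colon is read by browsers as having a URI scheme, so prefixing it with "./" keeps it relative. The function name `protect_category_links` is hypothetical, and the sketch assumes the mirrored pages link to same-directory files literally named "Category:...".

```python
import re

# Matches href="Category:..." only when the link starts with the bare namespace.
_CATEGORY_HREF = re.compile(r'(href=")(Category:)')


def protect_category_links(html):
    """Prefix bare Category: links with './' so they are not parsed as a scheme."""
    return _CATEGORY_HREF.sub(r"\1./\2", html)
```

Links that already start with "./" or with a real scheme such as "http:" are left untouched, so running it twice over the same file should be harmless.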
19:48 BenBE mikemol I know. The language detection and the snippets are two separate things in that way that I do not "link" against them. I.e. I only create a work based on them ("the trained detection mechanism") that I always can also start of using other input.
19:49 mikemol BenBE: What I was pointing out is that you may eventually be subject to a takedown, same as I may be.
19:49 BenBE What I was pondering about (more or less) was if I did a "release" of the gathered files for everyone beforehand (so they can play with them separately) or release them by the time I got the detection stuff working to some degree.
19:50 mikemol (An individual contributor may decide to revoke his license)
19:51 BenBE TimToady if you just need the sorted code samples (by language; no link to the origin page though) I can give you a fresh copy ;-)
19:51 mikemol I *think* the takedown to you would come from the contributor, but I'm uncertain. If it ever got around to that, I'll have to pay for a lawyer to get real advice.
19:51 mikemol BenBE: I imagine he could use the perl6 dir, but the Perl 6 examples tend to contain a significant in-wiki, out-of-code preamble describing things.
19:53 TimToady BenBE: the task pages come up nicely, which is most of what I need for my talk, but thanks
19:54 TimToady though if it's true to form, all the css will go away if I drop of the network...
19:54 TimToady *off the
19:58 TimToady yup, loses all the css
19:58 TimToady still, it beats a kick in the head
19:59 BenBE mikemol Yeah. In such cases I'd have to remove those single files from the distribution (German law affecting me here: Only if the work I have to do for it is reasonable to do so).
20:00 BenBE mikemol: Meaning: If someone asks for a single file to be taken out of the archive I've to do it. If I only get "this snippet" and had to look for it I usually wouldn't have to. Same with the whole archive: Since the one doesn't have the authority on it.
20:01 TimToady Coderjoe: would a -E help with the css, or is it just not available to wget?
20:02 TimToady or mikemol ^^
20:02 BenBE mikemol What about the Perl6 stuff? I'm not sure what you are going to point out there; at least I'm not sure.
20:03 mikemol BenBE: My thought was that TimToady might find the raw perl6 code snippets handy, as he wouldn't have to extract them from a large HTML page.
20:03 BenBE TimToady Wouldn't a DB dump be all you needed; removing e.g. User accounts and other private information?
20:03 mikemol BenBE: If I could do that kind of db dump easily, I'd be publishing it monthly.
20:04 BenBE ah, k. Unfortunately I have only the raw code; not the pages they belong to (and also no ref to reconstruct this)
20:05 TimToady we might find the raw snippets useful for regression testing, but probably not so much for public show
20:13 mikemol TimToady: Running with -pkmE
20:55 mikemol TimToady: Grab that URL again
20:55 mikemol This one's 143M. Not certain why the difference.
21:05 BenBE Seems kinda slow DL  ... only ~60 KBps ...
21:06 dagnyscott1 left #rosettacode
21:09 mikemol BenBE: Dunno what's going on, there. I'm pushing 855Kb/s, according to iftop.
21:09 mikemol Up to just shy of 1Mb/s
21:20 MigoMipo left #rosettacode
21:42 BenBE Well, it's done on my site ...
21:46 slavik left #rosettacode
21:47 slavik joined #rosettacode
22:00 Coderjoe I'm not sure what tags -p checks for page requisites. it likely doesn't catch css @includes
22:32 rodt lol number conversion is orgasmic
22:33 rodt well converting from base n is
22:35 mwn3d_phone Lol I think you like math in the wrong ways
22:37 rodt no really, it's obvious really, you start raising to power of number of digits - 1, and raise to less and less power
22:37 rodt it gets faster and faster, and then so fast it blurs and "emits" the result
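A small sketch of the conversion as described: the leftmost digit is weighted by the base raised to (number of digits - 1), and the exponent drops by one per digit. The function name `from_base` is hypothetical, and this is not rodt's implementation, which the log does not show.

```python
def from_base(digits, base):
    """Convert a digit string in the given base (2..36) to an integer."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    value = 0
    power = len(digits) - 1  # highest exponent is (number of digits - 1)
    for ch in digits:
        digit = int(ch, 36)  # maps 0-9 and a-z (either case) to 0..35
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value += digit * base ** power
        power -= 1  # each following digit gets a smaller power
    return value
```

For example, `from_base("1011", 2)` adds 1*8 + 0*4 + 1*2 + 1*1.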
22:38 rodt i can see algorithms process in my lang, and was just watching it :P
23:08 mikemol rodt: You should export animated gifs, if you can actually watch the process. Alternately, encode a video or some such. I'd be happy to help you with the tools and techniques for video encoding. :)
23:15 TimToady the -E still loses all the css, it would appear...
23:15 Hypftier theoretically you could put the animation inside an SVG (albeit rsvg won't make an animated GIF from it)
23:16 Hypftier TimToady: didn't wget forget CSS and resources linked from there anyway when mirroring?
23:16 TimToady we were trying -E to see if it preserved css maybe
23:17 Hypftier Ah. To my knowledge wget was stuck in HTML 3 times or so for mirroring. HTTrack might be an alternative specifically for mirroring websites.
23:18 TimToady grepping through, it still has lots of http css refs into /mw
23:20 Hypftier Hm, from the docs, -p should be enough.
23:25 TimToady stuff from /mw/skins, and things like: <link rel="stylesheet" href="http://rosettacode.org/mw/index.php?title=MediaWiki:Common.css&amp;usemsgcache=yes&amp;ctype=text%2Fcss&amp;smaxage=18000&amp;action=raw&amp;maxage=18000" />
23:25 mikemol Guys, throw whatever you care to try at the server. As long as it respects robots.txt, the server should handle it just fine.
23:25 mikemol I'm occupied for the evening. >.>
23:25 Hypftier Why so generous all of a sudden with server resources? Upgraded it?
23:26 mikemol Hypftier: Over a year ago. With the squid proxy, it'll handle just about anything that doesn't mutate pages.
23:26 TimToady he's going to bed :)
23:26 mikemol TimToady: Not quite. Have company over for the evening. Regular Tuesday thing.
23:27 TimToady ah...why aren't you paying attention to them, then?  :)
23:28 mikemol Because I can say a word or two, and they'll go back and forth for five minutes. But I still need to listen. Which is why I'm off for the evening after opening the floodgates to your mirror tools. :)
23:28 Hypftier Why the sudden RC mirroring, by the way?
23:28 TimToady I'm giving a talk on Thu night
23:29 Hypftier No internet access there you want to rely on? :)
23:29 TimToady you got it
23:29 TimToady mind you, OSCON does better than they used to
23:30 TimToady but it's sort of a trend that I refer to RC examples in various venues
23:31 TimToady it's a symbiotic relationship that is good for both RC and any languages that look good on RC, of which I'm thinking of one in particular :)
23:31 TimToady well, maybe two :)
23:32 * Hypftier is still pondering using RC in his diploma thesis
23:34 TimToady I might at least be able to wget mw/skins
23:39 TimToady nah, I think robots.txt disallows it
23:39 TimToady well, might be able to get them one by one
23:43 mwn3d_phone Hypftier: what's your thesis about?
23:46 Hypftier Dynamically composing text from an author-defined model according to reader-specified criteria. The intent is to have readable and coherent linear texts in the end (though not completely automated; coherence is something the author has to take care of).
23:46 Hypftier I thought maybe one could build some kind of "Programming introduction" from RC; criteria could be the programming language one is interested in, for example. But I fear there's not too much beyond that and it wouldn't make a good example.
23:47 Hypftier Maybe, but that would need a bit more tweaking of content, one could make a PL one already knows a criterion and the resulting text could adapt to that. Still pondering :)
23:48 mwn3d_phone Maybe you could use a programming topic like recursion. It has a category and some encyclopedic text to go with it.
23:48 mwn3d_phone Or maybe I don't quite understand what kind of information you need to read and generate
23:49 Hypftier I'm not quite sure I can explain it coherently, either. I'm writing in German too which certainly doesn't help in explaining it in English :)
23:50 mwn3d_phone Heh fair enough
23:50 mwn3d_phone I'll read about it in the computing history books of the future
23:50 Hypftier nah, I doubt it ;)
23:50 mwn3d_phone They'll have people who are paid to translate it
23:50 Hypftier In fact, similar things have been done already.
23:51 Hypftier I just need to explain properly why I'm different :P
23:51 Hypftier Basically I'm considering two actors: Author and reader. The author would build a model of different text variants (comprised of fragments that are composed together based on the criteria [which select the variants]). An early prototype (German; sorry) was http://hypftier.de/temp/turing.html based on the WP article on Turing machines.
23:51 fedaykin "Turingmaschine"
23:53 Hypftier It has only a single (very vague) criterion which roughly models reader knowledge in theoretical CS and the implementation was a two-day hackjob, but at least I got changing text.
23:56 mwn3d_phone So is it transforming text written by an author according to another reader's preferences?
23:56 Hypftier Basically, yes. Except that the author has to take care of the possible transformations, as well. That's hard to do automatically.
23:57 mwn3d_phone Hm. Cool.
23:57 Hypftier A very early idea was that Wikipedia could implement a slider for users that allows them to vary the content they see between "encyclopedic" and "trivia". It would only need properly marked-up content ;). From there it ... kinda grew.
23:57 mwn3d_phone Getting computers to do proper things with natural language has been a problem forever. Any incremental progress we can get is good.
23:58 mwn3d_phone Aw man that'd be awesome
23:58 mwn3d_phone Simple wikipedia tries to get the bottom of that slider down
23:59 mwn3d_phone It's a nice website as long as it has the article you're looking for
23:59 Hypftier (Wikipedia users constantly discuss notability policies endlessly and I think implementing such a thing where readers could just decide for themselves what they want to read would be much more beneficial than the eternal discussions what to include and what not ;)
