IRC log for #rosettacode, 2012-01-13

All times shown according to UTC.

Time Nick Message
00:30 mikemol joined #rosettacode
00:37 mwn3d_phone1 joined #rosettacode
01:23 mwn3d_phone joined #rosettacode
03:43 mikemol Gah.
03:43 mikemol RC is being regularly taken down by a normalish user who I don't think is doing it intentionally.
03:44 mikemol User agent is "Links (2.3pre1; Linux 3.0.0-14-generic i686)", but he hits with up to a dozen requests per second.
03:46 sorear huh, so I'm not the only person in the world using links
03:46 sorear (it can't be me; I haven't visited RC this year)
03:46 mikemol I use it on occasion, too. Not for much, though.
03:46 mikemol Well, not much I can do now except note the IP and figure out what to do about it later.
03:47 mikemol Gotta restart squid, apache, memcached and mysql to get crap out of swap, though.
03:48 sorear what?
03:49 mikemol It triggered an OOM on some process, shunting a bunch of other stuff into swap as it approached that.
03:49 mikemol Looks like I don't have mpm_prefork configured properly on the new server yet. I'll have to fix that.
03:49 mikemol Until I restart the primary services, though, RC acts dog slow.
03:50 sorear so sending a dozen requests in a second is a bad idea, huh.
03:51 sorear if there's a maximum number of concurrent MediaWiki requests that can be safely handled, I wonder how hard it would be to have squid use a semaphore
03:51 sorear *ramble*
03:51 mikemol *normally* it's not a problem; almost everything is cached by squid. Except on this occasion (and one or two others), rather than requesting http://rosettacode.org/wiki/PAGENAME, they requested http://rosettacode.org/mw/index.php?title=PAGENAME.
03:51 fedaykin "PAGENAME, - Rosetta Code" "PAGENAME. - Rosetta Code" http://rldn.net/16ln5
03:52 mikemol The former is the vastly more common form. There's no valid reason anything should request the latter in an automated fashion...that's even disallowed by robots.txt, as it usually represents expensive history walking.
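[Aside: the two URL forms mikemol contrasts can be told apart mechanically. The following is a minimal sketch, assuming only the two path shapes quoted in the log (/wiki/PAGENAME and /mw/index.php?title=PAGENAME); the function name request_form and its labels are hypothetical and not part of RC's actual setup.]

```python
from urllib.parse import urlparse, parse_qs


def request_form(url):
    """Classify a URL as 'pretty', 'index.php', or 'other'.

    'pretty'    -> /wiki/PAGENAME, the common form that squid usually has cached
    'index.php' -> /mw/index.php?title=PAGENAME, the form that caused trouble
    """
    parts = urlparse(url)
    if parts.path.startswith("/wiki/") and not parts.query:
        return "pretty"
    if parts.path == "/mw/index.php" and "title" in parse_qs(parts.query):
        return "index.php"
    return "other"


print(request_form("http://rosettacode.org/wiki/PAGENAME"))
print(request_form("http://rosettacode.org/mw/index.php?title=PAGENAME"))
```

[A log filter built on something like this could count how many hits per client fall into the second category, which is the pattern described above.]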
03:52 sorear Does RC have a database dump download link?
03:52 mikemol But, anyway, the problem is fundamentally with my apache configuration; I'm allowing apache to spawn too many processes in burst scenarios.
03:52 sorear I agree
03:53 sorear no amount of traffic should ever crash a server.  Starve other users, yes
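[Aside: sorear's semaphore idea can be illustrated in general terms. This is a sketch of the technique only, not of anything squid or apache provides; the class name RequestGate and the choice to refuse rather than queue excess requests are assumptions made for the example. It caps how many requests run at once, so a burst gets turned away instead of exhausting memory.]

```python
import threading


class RequestGate:
    """Allow at most max_concurrent requests to run at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def handle(self, work):
        """Run work() if a slot is free.

        Returns (True, result) when the request ran, or (False, None) when
        every slot was busy and the request was refused.
        """
        # Non-blocking acquire: refuse immediately rather than pile up waiters.
        if not self._slots.acquire(blocking=False):
            return (False, None)
        try:
            return (True, work())
        finally:
            self._slots.release()


gate = RequestGate(max_concurrent=1)
# The outer request holds the only slot, so the nested one is refused.
print(gate.handle(lambda: gate.handle(lambda: "inner")))
```

[A real front end would answer a refused request with an error status such as 503; blocking until a slot frees up is the other obvious design, at the cost of a growing queue.]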
03:54 mikemol sorear: No, there's no db download link. I intend to set one up eventually, but I need to set up automated stripping of private data like email addresses, passwords, etc.
03:55 mikemol Then again, I've intended to set up such a thing almost since I moved RC to a VPS I pay for.
03:55 sorear and until then, do people with a statistical bent have any options beside scraping?
03:57 mikemol MW API
03:58 mikemol Not enabled for users by default, but an account need only get added to the "Bots" group.
03:59 mikemol But, seriously, page history crawling is *expensive*. MW stores history by iterative diffs off of the current state.
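[Aside: for readers wondering what an "MW API" request looks like, here is a minimal sketch that only builds a query URL and does not contact any server. The endpoint path /mw/api.php is an assumption inferred from the /mw/index.php path quoted earlier, and the helper name revisions_query_url is hypothetical. The parameters (action=query, prop=revisions, rvprop, rvlimit, format) are standard MediaWiki API parameters. As noted in the log, access on RC required an account in the "Bots" group.]

```python
from urllib.parse import urlencode

# Assumed endpoint location; not confirmed anywhere in the log.
API_ENDPOINT = "http://rosettacode.org/mw/api.php"


def revisions_query_url(title, limit=10):
    """Build a URL asking for recent revision metadata of one page.

    Only ids, timestamps and user names are requested, not revision text,
    to keep the request light.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user",
        "rvlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(revisions_query_url("PAGENAME", limit=5))
```

[Keeping rvlimit small and spacing requests out would be the polite way to use such a query, given the cost of history access that mikemol describes.]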
04:36 deltree_ joined #rosettacode
06:04 mischi joined #rosettacode
12:00 GlitchMr joined #rosettacode
13:02 mischi joined #rosettacode
13:19 mwn3d_phone joined #rosettacode
14:03 kpreid joined #rosettacode
15:24 GlitchMr42 joined #rosettacode
16:41 kpreid joined #rosettacode
16:54 kpreid joined #rosettacode
17:55 kpreid joined #rosettacode
18:16 GlitchMr joined #rosettacode
18:21 mischi joined #rosettacode
18:44 kpreid joined #rosettacode
19:17 kpreid joined #rosettacode
19:19 robbrit joined #rosettacode
19:34 mikemol I've done a fair amount of tweaking serverside, and I need to give it some load.
20:39 mwn3d_phone joined #rosettacode
20:45 robbrit left #rosettacode
21:07 kpreid joined #rosettacode
21:29 mischi joined #rosettacode
23:13 FireFly joined #rosettacode
23:13 mwn3d_phone joined #rosettacode
23:38 Coderjoe_ joined #rosettacode